Cross Reference To Related Applications
The present application claims the benefit of U.S. provisional patent application
Serial No. 60/112,723 filed December 18, 1998 and U.S. provisional patent application
Serial No. entitled "Production of Cyclic Peptides and Proteins In Vivo" filed
October 7, 1999, both of which are incorporated herein by reference.



Statement As To Federally-Sponsored Research 
This invention was made with Government support under grants GM13306 and 
GM19891 awarded by the National Institutes of Health. The Government may have 
certain rights in the invention.



Field Of The Invention 
The invention relates to the field of biochemistry. More particularly, the invention
relates to cyclic peptides, methods for making cyclic peptides, and methods of
screening cyclic peptides for particular characteristics.



Background Of The Invention

Small linear peptides are useful for investigating various physiological phenomena
because they exhibit a wide range of biological activities and can be easily synthesized in
almost infinitely variable sequences utilizing conventional techniques in solid phase synthesis
and combinatorial chemistry. These qualities also make small linear peptides especially
useful for identifying and developing new drugs. For example, large libraries of myriad
different small linear peptides can be prepared synthetically and then screened for a
particular characteristic in various biological assays. See, e.g., Scott, J. K. and G. P. Smith,
Science 249:386, 1990; Devlin, J. J., et al., Science 249:404, 1990; Furka, A. et al., Int. J.
Pept. Protein Res. 37:487, 1991; Lam, K. S., et al., Nature 354:82, 1991. Those
peptides within the library that exhibit the particular characteristic can then be isolated as
candidates for further study. Microsequencing or other chemical analyses can then be used
to characterize selected peptides by, for example, amino acid sequence. Despite these
advantages, only a handful of small linear peptides have been developed into widely-used
pharmaceutical drugs. One reason for this is that small linear peptides are usually cleared
from the body too rapidly to be of therapeutic value.

Ring closure, or cyclization, can reduce the rate at which peptides are degraded in
vivo and therefore dramatically improve their pharmacokinetic properties. The majority of
cyclic peptides of known therapeutic value have been identified after isolation from natural
sources (e.g., calcitonins, oxytocin, and vasopressin). Unfortunately, the pool of
naturally-existing cyclic peptides that can be screened for a particular biological activity is
inherently limited. Moreover, the onerous steps required to isolate and purify cyclic peptides
from natural sources render such screens costly and impractical. Thus, synthetic methods
for producing large numbers of different peptides of infinitely variable amino acid sequences
would greatly facilitate identifying particular cyclic peptides as candidates for new drugs.

Various methods for producing cyclic peptides have been described. For example,
chemical reaction protocols, such as those described in U.S. Patent Nos. 4,033,940 and
4,102,877, have been devised to produce circularized peptides. In other techniques,
biological and chemical methods are combined to produce cyclic peptides. These latter
methods involve first expressing linear precursors of cyclic peptides in cells (e.g., bacteria)
and then adding an exogenous agent such as a protease or a nucleophilic reagent to
chemically convert these linear precursors into cyclic peptides. See, e.g., Camarero, J. A.,
and Muir, T. W., J. Am. Chem. Soc. 121:5597 (1999); Wu, H. et al., Proc. Natl. Acad. Sci.
USA 95:9226 (1998).

Once produced, cyclic peptides can be screened for pharmacological activity. For
example, a library containing large numbers of different cyclic peptides can be prepared
and then screened for a particular characteristic, such as the ability to bind a specific target
ligand. The library is mixed with the target ligand, and those members of the library that
bind to the target ligand can be isolated and identified by amino acid sequencing. Similarly,
libraries of cyclic peptides can be added to assays for a specific biological activity. Those
cyclic peptides which modulate the biological activity can then be isolated and identified by
sequencing.

Unfortunately, because the step of identifying the active peptides can be difficult,
these screening assays can prove laborious and time-consuming. For instance, screening
assays usually mandate a reverse-mapping step because the actual amount of cyclic peptide
that binds a target ligand or modulates a biological activity is usually so minute that it cannot
be sequenced directly. To avoid this problem, a map indicating the physical location of the
various cyclic peptides comprising a library can be made. Aliquots of cyclic peptides from
the different locations are then transferred to corresponding locations within the screening
assay; and those areas in the assay that exhibit the screened-for activity (e.g., binding or
modulation of biological activity) are then mapped back to their corresponding location in
the library. The cyclic peptides in that area of the library can then be isolated and
sequenced. Difficulties arising from the need for spatial resolution and the limitations
imposed by sample handling limit the number of candidate peptides that can be screened in
any given period of time.

The number of peptides that can be screened in an assay can be dramatically
increased by using cells that express the peptides. For example, bacteria engineered to
express a library of linear peptides can be added to a screening assay, and those bacteria
that express the screened-for characteristic can be picked directly from the assay. The
picked bacteria can then be reproduced to large numbers such that the selected linear
peptides can be made in large quantities to facilitate their identification (e.g., by sequencing)
and production. Making and screening small linear peptide libraries in vivo has, however,
proven to be troublesome because small linear peptides are rapidly degraded by normal
cellular metabolic processes. Cyclization of the peptides can circumvent this problem by
rendering the peptides stable within a cell.

Despite this, intracellular production of large libraries of cyclic peptides
has heretofore not been feasible because general, easy-to-perform methods for cyclizing
peptides in vivo have not been available. For example, a known method of producing cyclic
peptides in vivo utilizes non-ribosomal peptide synthetase (NRPS) complexes (Cane et al.,
Science 282:63, 1998). Such NRPS complexes are, however, neither facile to work with nor
generally useful for the production of more than a single cyclic peptide at a time. Moreover,
unlike ribosomal peptide synthesis where the linear sequence of monomers (amino acids) is
dictated by the linear sequence of bases in the nucleic acid molecule encoding it, the linear
sequence of monomers in a peptide made by the NRPS method is dictated by the subunit
organization of the NRPS complex. Changing the sequence of a cyclic peptide made by
NRPS entails cloning the subunit(s) which incorporate the desired monomers and
introducing the subunit(s) into host cells already harboring all of the other necessary
subunits. Making a library using this technique would require introducing combinations
(both in composition and order) of NRPS subunits to host cells and devising a method for
ensuring that the subunits assemble into the correct supramolecular structures.



Summary Of The Invention 

A general method for the in vivo production and screening of cyclic peptide
libraries has been discovered. In this method, a nucleic acid molecule is constructed such
that a nucleotide sequence encoding the peptide to be cyclized is flanked on one end with a
nucleotide sequence encoding the carboxy-terminal portion of a split (or trans) intein
(C-intein or I_C) and on its other end with a nucleotide sequence encoding the amino-terminal
portion of a split intein (N-intein or I_N). Expression of the construct within a host system
such as a bacterium or eukaryotic cell results in the production of a fusion protein. The
two split intein components (i.e., I_C and I_N) of the fusion protein then assemble to form an
active enzyme that splices the amino and carboxy termini together to generate a backbone
cyclic peptide.

[Illustration: Mechanism of Intein Mediated Cyclization (structures A through E)]

Formation of the active intein from the amino- and carboxy-terminal fragments stabilizes the
ester isomer of an amino acid at the junction between the N-intein and the peptide to be
cyclized (in B above, X = S or O). When R = XH, the heteroatom from the C-intein is poised
to attack the ester and generate a cyclic ester intermediate (C). Intein-catalyzed
aminosuccinimide formation (D) liberates the cyclic peptide (in the lactone form), which
spontaneously rearranges to form the thermodynamically favored backbone (lactam form)
cyclic peptide product (E). This method can be adapted to facilitate the selection or
screening of cyclic peptides with predetermined characteristics.
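
The arrangement of the construct described above can be sketched in code. The following is a minimal illustration only, assuming placeholder amino acid strings for the two split intein portions; the names and sequences are hypothetical and are not taken from this specification. The sketch shows only the order of the encoded elements (C-intein, target peptide, N-intein) and that the target peptide alone makes up the cyclic product; it does not model the splicing chemistry.

```python
from dataclasses import dataclass

# Placeholder strings for illustration only; these are NOT real intein sequences.
HYPOTHETICAL_C_INTEIN = "MCCCC"
HYPOTHETICAL_N_INTEIN = "NNNNN"


@dataclass(frozen=True)
class Precursor:
    """Fusion protein laid out as C-intein, target peptide, N-intein."""
    c_intein: str
    target: str
    n_intein: str

    def sequence(self) -> str:
        # Linear order of the translated fusion protein.
        return self.c_intein + self.target + self.n_intein


def build_precursor(target: str) -> Precursor:
    """Flank a target peptide with the two (placeholder) split intein portions."""
    if not target:
        raise ValueError("target peptide must not be empty")
    return Precursor(HYPOTHETICAL_C_INTEIN, target, HYPOTHETICAL_N_INTEIN)


def spliced_product(precursor: Precursor) -> str:
    """Return the residues that make up the cyclic peptide (the target only)."""
    return precursor.target
```

For example, `build_precursor("SGAVK").sequence()` places the target between the two placeholder intein strings, while `spliced_product` returns only the target residues.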

Accordingly, the invention features a non-naturally occurring nucleic acid molecule
encoding a polypeptide having a first portion of a split intein, a second portion of a split 
intein, and a target peptide interposed between the first portion of a split intein and the 
second portion of a split intein. Expression of the nucleic acid molecule in a host system 
produces a polypeptide that spontaneously splices in the host system to yield a cyclized 
form of the target peptide, or a splicing intermediate of a cyclized form of the target peptide 
such as an active intein intermediate, a thioester intermediate, or a lariat intermediate. 

Both the first portion of a split intein and the second portion of a split intein can be
derived from a naturally-occurring split intein such as Ssp DnaE. In other variations, one
or both of the split intein portions can be derived from non-naturally occurring split inteins
such as those derived from RecA, DnaB, Psp Pol-I, and Pfu inteins.

In another aspect, the invention features a non-naturally occurring nucleic acid
molecule encoding a polypeptide having a first portion of a split intein, a second portion of a
split intein, a third portion of a split intein, and a fourth portion of a split intein. This molecule
can have a first target peptide interposed between the first portion of a split intein and the
second portion of a split intein, and a second target peptide interposed between the third
portion of a split intein and the fourth portion of a split intein. The first portion of a split
intein can be complementary to the third portion of a split intein but not complementary to
the second portion of a split intein, and the second portion of a split intein can be
complementary to the fourth portion of a split intein but not complementary to the third
portion of a split intein.

Also within the invention is an expression vector comprising a nucleic acid molecule 
within the invention. Expression of the vector in a host system produces a polypeptide that 
spontaneously splices in the host system to yield a cyclic peptide or a splicing intermediate. 
The expression vector of the invention can also contain a regulatory sequence that facilitates
expression of the polypeptide in the host system. The nucleic acid molecule of the vector
can include a nucleotide sequence that encodes a peptide that facilitates screening of the 
cyclized form of the target peptide for a particular characteristic and/or a nucleotide 
sequence that encodes a peptide that facilitates purifying the cyclized form of the target 
peptide from the host system. The expression vector can also be inducible. 

In another aspect, the invention features an expression vector encoding a

polypeptide having a target peptide that has a first end fused to a first portion of a split intein 
and a second end fused to a second portion of a split intein. Expression vectors of the 
invention can be a plasmid, a bacteriophage, a virus, a linear nucleic acid molecule, or other 
type of vector. 

The invention additionally features a substantially pure polypeptide having a first
portion of a split intein, a second portion of a split intein, and a target peptide interposed
between the first portion of a split intein and the second portion of a split intein. The
polypeptide can be one that spontaneously splices in the host system to yield a cyclized
form of the target peptide, or it can be a splicing intermediate.

Also within the invention is a host system harboring a nucleic acid molecule of the
invention. The host system can be a prokaryote such as a bacterium, an archaebacterium,
a eukaryote such as a yeast or a mammalian cell, a plant cell, an in vitro
transcription/translation system, or a cell lysate.

In another aspect, the invention features a method for making a peptide molecule.

This method includes the steps of: providing an isolated nucleic acid molecule of the 
invention; providing a host system; introducing the isolated nucleic acid molecule into the 
host system; and expressing the isolated nucleic acid molecule. In one variation, the step of 
expressing the isolated nucleic acid molecule results in production of a polypeptide that 

spontaneously splices in the host system to yield the cyclized form of the target peptide.
This method can also feature the step of purifying the cyclized form of the target peptide 
from the host system. 

In another variation of this method, the step of expressing the isolated nucleic acid
molecule results in production of a splicing intermediate of a cyclized form of the target
peptide. This method can also feature the step of purifying the splicing intermediate of a
cyclized form of the target peptide from the host system. Yet another variation of this
method includes the step of forming the cyclic peptide from the splicing intermediate.

In another aspect of this method, the target peptide is produced in a cyclized form 
in the host system in the absence of an exogenously-added agent such as a protease or a 

thiol.

Another aspect of the invention is a method of preparing a library of peptide
molecules. This method involves the steps of providing a plurality of nucleic acid molecules
encoding a plurality of target peptides having heterogenous amino acid sequences;
incorporating each of the plurality of nucleic acid molecules into an expression vector to
form a plurality of expression vectors; and expressing the expression vectors in the host
system. Each of the plurality of nucleic acid molecules is interposed between a nucleic acid
molecule encoding a first portion of a split intein and a nucleic acid molecule encoding a
second portion of a split intein in each of the formed expression vectors such that
expression of the expression vectors in a host system results in the production of a plurality
of peptide molecules such as polypeptides that spontaneously splice in the host system to
yield cyclized forms of the target peptides, or splicing intermediates of cyclized forms of the
target peptides.
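
To give a sense of the sequence diversity such a library can encode, the following minimal sketch counts and enumerates target peptides of a given length. It assumes full randomization over the 20 standard amino acids, which is an illustrative assumption rather than a requirement stated here; the function names are hypothetical.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids (assumed alphabet).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def library_size(length: int, alphabet: str = AMINO_ACIDS) -> int:
    """Number of distinct linear target sequences of the given length."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return len(alphabet) ** length


def enumerate_targets(length: int, alphabet: str = AMINO_ACIDS):
    """Lazily yield every target peptide sequence of the given length."""
    for residues in product(alphabet, repeat=length):
        yield "".join(residues)
```

Under these assumptions a fully randomized five-residue target corresponds to 20^5 = 3,200,000 sequences, which illustrates why expression of the library within cells is attractive compared with spatially mapped synthetic libraries.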

And in yet another aspect, the invention includes a method of screening a peptide
molecule for a predetermined characteristic. This method includes the steps of: providing a
nucleic acid molecule that encodes a polypeptide comprising a first portion of a split intein,
a second portion of a split intein, and a target peptide interposed between the first portion
of a split intein and the second portion of a split intein; providing the host system;
introducing the isolated nucleic acid molecule into the host system; placing the host system
under conditions that cause the peptide molecule to be produced; and testing the peptide
molecule for the predetermined characteristic. Expression of the nucleic acid molecule in a
host system produces either a cyclized form of the target peptide resulting from
spontaneous splicing of the polypeptide in the host system, or a splicing intermediate of a
cyclized form of the target peptide.

In one variation of this method, the predetermined characteristic includes the ability
to specifically bind a target molecule, and the step of testing the peptide molecule for the
predetermined characteristic includes the steps of (a) contacting the peptide molecule to the
target molecule and (b) determining whether the peptide molecule binds to the target
molecule. In another variation, the predetermined characteristic is the ability to modulate a
biochemical reaction, and the step of testing the peptide molecule for the predetermined
characteristic comprises the steps of (a) contacting the peptide molecule to a system
containing the biochemical reaction and (b) determining whether the peptide molecule
modulates the biochemical reaction. The step of determining whether the peptide molecule
binds to a target molecule or modulates a biochemical reaction can be performed by
observing a color change or a fluorescent signal, or by analyzing the cell cycle or the
reproduction of an organism.
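
The testing step can be pictured as applying a readout to each member of a library and keeping those that meet a cutoff. The sketch below is illustrative only: the `assay` callable is a hypothetical stand-in for a measured signal (for example, a color change or fluorescent signal), and the threshold is an assumed parameter, not a value given in this specification.

```python
from typing import Callable, Dict, List


def screen_library(
    clones: Dict[str, str],
    assay: Callable[[str], float],
    threshold: float,
) -> List[str]:
    """Return identifiers of clones whose peptide scores at or above threshold.

    clones maps a clone identifier to the target peptide it expresses.
    assay is a hypothetical stand-in for an experimental readout.
    """
    return [
        clone_id
        for clone_id, peptide in clones.items()
        if assay(peptide) >= threshold
    ]
```

Because each hit is returned by clone identifier, the selected host cells can be grown up and the encoding nucleic acid sequenced, rather than sequencing minute amounts of peptide directly.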

The target molecule in these methods can be a cell-associated molecule such as a
membrane-associated molecule or an intracellular molecule (e.g., a nuclear molecule or one
or more organelles such as mitochondria, lysosomes, endoplasmic reticula, chloroplasts,
Golgi, and periplasm). It can also be an extracellular molecule.

The biochemical reaction can be a cell-associated process such as an intracellular
metabolic event, a membrane-associated event, or a nuclear event. It can also be an
extracellular reaction.

In these methods, the step of testing the peptide molecule for the predetermined
characteristic can be performed using a hybrid system, and/or can include the step of
immobilizing the peptide molecule on a solid phase support.

The invention also features a method for purifying a cyclic peptide from a mixture.
This method includes the steps of: providing a mixture containing a splicing intermediate
conjugated with an affinity tag; mixing the conjugated splicing intermediate with a solid
phase support having a ligand thereon that specifically binds the affinity tag such that the
support becomes specifically bound with the splicing intermediate; washing the support to
remove non-specifically bound matter from the support; adding to the support a reagent
that makes a cyclic peptide from the splicing intermediate; and eluting the cyclic peptide
from the support.

In a variation of the foregoing, the invention also includes a method for purifying a
cyclic peptide from a mixture that includes the steps of: providing a mixture containing a
splicing intermediate conjugated with an affinity tag; mixing the conjugated splicing
intermediate with a solid phase support having a ligand thereon that specifically binds the
affinity tag such that the support becomes specifically bound with the splicing intermediate;
washing the support to remove non-specifically bound matter from the support; eluting the
splicing intermediate from the support; and adding to the eluted splicing intermediate a
reagent that makes a cyclic peptide from the splicing intermediate.

Additionally, included in the invention is a method for purifying a target molecule that
binds a splicing intermediate from a mixture. This method includes the steps of: providing a
solid phase support having the splicing intermediate specifically bound thereon; contacting
the support with the target molecule in the mixture; washing the support to remove
non-specifically bound matter from the support; and eluting the target molecule from the
support.

As used herein, the phrase "non-naturally occurring" means being directly or
indirectly made or caused to be made through human action. Thus, a non-naturally
occurring nucleic acid molecule is one that has been produced through human manipulation,
and not natural evolutionary processes.

By the phrase "nucleic acid molecule" is meant any chain of two or more
nucleotides bonded in sequence. For example, a nucleic acid molecule can be a DNA or
an RNA.



As used herein, the term "peptide" means a chain of two or more amino acids
bonded in sequence, and includes polypeptides and proteins. By "polypeptide" is meant
a polymer comprised of two or more peptides, regardless of length or post-translational
modification. By "protein" is meant any chain of amino acids and includes peptides,
polypeptides, proteins, and modified proteins such as glycoproteins, lipoproteins,
phosphoproteins, metalloproteins, and the like.

A "linear peptide" is a peptide that is not in a circular form, and generally has both a
carboxy-terminal amino acid with a free carboxy-terminus and an amino-terminal amino
acid with a free amino-terminus.

In comparison, a "cyclic peptide" is a peptide that has been "cyclized." The term
"cyclic" means having constituent atoms forming a ring. When referring to a peptide, the
term "cyclize" means to make the peptide into a cyclic or "cyclized" form. Thus, for
example, a linear peptide is "cyclized" when its free amino-terminus is covalently bonded to
its free carboxy-terminus (i.e., in a head to tail format) such that no free carboxy- or amino-
terminus remains in the peptide.
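
One consequence of the head-to-tail definition is that a cyclic peptide has no unique starting residue, so several linear spellings describe the same ring. The following sketch, offered only as an illustration with hypothetical function names, picks a single representative spelling by rotation. Reversed sequences are deliberately not treated as equivalent, since peptide bonds have a direction.

```python
def rotations(seq: str):
    """Yield every rotation of a sequence (each possible starting residue)."""
    for i in range(len(seq)):
        yield seq[i:] + seq[:i]


def canonical_cyclic(seq: str) -> str:
    """Pick one representative spelling: the lexicographically smallest rotation."""
    if not seq:
        raise ValueError("sequence must not be empty")
    return min(rotations(seq))


def same_cyclic_peptide(a: str, b: str) -> bool:
    """True if two linear spellings describe the same head-to-tail ring."""
    return len(a) == len(b) and canonical_cyclic(a) == canonical_cyclic(b)
```

A representative of this kind can be useful when comparing sequencing results from a screen, where the same cyclic peptide may be reported starting from different residues.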

As used herein, a "splicing intermediate" is a polypeptide generated during the
intein-mediated cyclization reaction illustrated above prior to the formation of the liberated
cyclic peptide product. Splicing intermediates include "active-intein intermediates" (i.e.,
those with a chemical structure similar to the polypeptide labeled "A" in the above
illustration), "thioester intermediates" (i.e., those with a chemical structure similar to the
polypeptide labeled "B" in the above illustration), and "lariat intermediates" (i.e., those with
a chemical structure similar to the polypeptide labeled "C" in the above illustration).

By the phrase "target peptide" is meant a peptide to be cyclized or displayed in a
splicing intermediate. For example, a peptide interposed between a carboxy-terminal
portion of a split intein and an amino-terminal portion of a split intein in a precursor protein
would be a target peptide, if the peptide becomes cyclized upon splicing of the precursor




WO 00/36093 PCT/US99/30162 


protein or becomes a part of a splicing intermediate upon processing (e.g., folding) of the 
precursor protein. 

As used herein, the word "intein" means a naturally-occurring or artificially-
constructed polypeptide sequence embedded within a precursor protein that can catalyze a
splicing reaction during post-translational processing of the protein. A list of known inteins is
published at http://www.neb.com/inteins.html. A "split intein" is an intein that has two or
more separate components not fused to one another.

As used herein, the word "interposed" means placed in between. Thus, in a
polypeptide having a first peptide interposed between a second and a third peptide, the
chain of amino acids making up the first peptide is physically located in between the chain
of amino acids making up the second peptide and the chain of amino acids making up the
third peptide.

A plurality of peptides having "heterogenous amino acid sequences" means that the
plurality of peptides is composed of at least two, but generally a large number of, different
peptides of disparate amino acid sequence.

As used herein, the phrase "host system" refers to any medium or vehicle in which a 
nucleic acid molecule can be transcribed, replicated, and/or translated; and/or any medium 
or vehicle in which a polypeptide can be spliced or otherwise post-translationally 
processed. 

As used herein, the word "spontaneously" means the action described occurs
without the addition of an exogenous substance. For example, a precursor polypeptide
within a host system spontaneously splices in the host system to yield a cyclic peptide when
nothing is added to the host system other than the precursor polypeptide or a nucleic acid
molecule encoding the precursor polypeptide. In comparison, a precursor polypeptide
within a host system does not spontaneously splice in the host system if an agent extraneous
to the host system is required to generate the cyclic peptide.

As used herein, the term "splice" or "splices" means to excise a central portion of
the polypeptide to form two or more smaller polypeptide molecules. In some cases,
splicing also includes the step of fusing together two or more of the smaller polypeptides to
form a new polypeptide.

As used herein, the word "derived" means directly or indirectly obtained from, 
isolated from, purified from, descended from, or otherwise arising from. 

As used herein, the phrase "expression vector" means a vehicle that facilitates
transcription and/or translation of a nucleic acid molecule in a host system. An expression
vector is "inducible" when adding an exogenous substance to a host system containing the
expression vector causes the vector to be expressed (e.g., causes a nucleic acid molecule
within the vector to be transcribed into mRNA).

By the phrase "expression of" a nucleic acid is meant that the nucleic acid is
transcribed and/or translated into a polypeptide and/or replicated.

As used herein, the phrase "regulatory sequence" means a nucleotide sequence 
which modulates expression (e.g., transcription) of a nucleic acid molecule. For example, 
promoters and enhancers are regulatory sequences. 

By the term "fused" is meant covalently bonded to. For example, a first peptide is
fused to a second peptide when the two peptides are covalently bonded to each other (e.g.,
via a peptide bond).

As used herein, an "isolated" or "substantially pure" substance is one that has been
separated from components which naturally accompany it. Typically, a polypeptide is
substantially pure when it is at least 50% (e.g., 60%, 70%, 80%, 90%, 95%, and 99%) by
weight free from the other proteins and naturally-occurring organic molecules with which it
is naturally associated.
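
The weight-based threshold in this definition amounts to simple arithmetic. The sketch below assumes one reading of the definition, namely that the polypeptide of interest makes up at least the stated percentage of the combined weight of itself and its naturally accompanying components; the function names and the default cutoff parameter are illustrative.

```python
def percent_purity(target_mass: float, contaminant_mass: float) -> float:
    """Weight percent of the target in the total (same mass units for both)."""
    if target_mass < 0 or contaminant_mass < 0:
        raise ValueError("masses must be non-negative")
    total = target_mass + contaminant_mass
    if total == 0:
        raise ValueError("total mass must be greater than zero")
    return 100.0 * target_mass / total


def is_substantially_pure(
    target_mass: float,
    contaminant_mass: float,
    minimum_percent: float = 50.0,  # lowest threshold named in the definition
) -> bool:
    """Apply the weight-percent cutoff under the stated interpretation."""
    return percent_purity(target_mass, contaminant_mass) >= minimum_percent
```

For instance, 9 mg of polypeptide accompanied by 1 mg of other naturally associated material corresponds to 90% by weight under this reading.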

A "progenitor DNA" is a particular deoxyribonucleic acid from which mutations are
made or upon which mutations are based.

By the phrase "target molecule" is meant any molecule used to determine the

binding or functional characteristics of another molecule. 

Herein, "bind" or "binds" means that one molecule recognizes and adheres to
another molecule in a sample, but does not substantially recognize or adhere to other
molecules in the sample. One molecule "specifically binds" another molecule if it has a
binding affinity greater than about 10^5 to 10^6 liters/mole for the other molecule.
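
A binding affinity expressed in liters/mole is an association constant, and its reciprocal is the corresponding dissociation constant in moles/liter. The sketch below works through that conversion; the choice of the lower end of the stated range as the default cutoff is an assumption made for illustration, and the function names are hypothetical.

```python
def dissociation_constant(ka_liters_per_mole: float) -> float:
    """Convert an association constant (L/mol) to a dissociation constant (mol/L)."""
    if ka_liters_per_mole <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_liters_per_mole


def specifically_binds(
    ka_liters_per_mole: float,
    cutoff: float = 1e5,  # assumed: lower end of the range given in the definition
) -> bool:
    """Apply the affinity cutoff from the definition of specific binding."""
    return ka_liters_per_mole > cutoff
```

On this conversion, affinities of 10^5 and 10^6 liters/mole correspond to dissociation constants of roughly 10 micromolar and 1 micromolar, respectively.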

A "cell-associated process" is one that takes place within a cell or in the near 
vicinity of the cell. 

A "membrane-associated event" is a cell-associated process that takes place on the 
plasma membrane of a cell. 

A "nuclear event" is a cell-associated process that takes place in the nucleus of a
cell.

In comparison to a cell-associated event, an "extracellular reaction" is one that 

does not take place within a cell. 

By the phrase "hybrid system" is meant two-hybrid systems, reverse two-hybrid
systems, one-hybrid systems, split-hybrid systems, small molecule hybrid systems and all
like systems for identifying interactions between peptides and other molecules (e.g.,
proteins and nucleic acid molecules). For a review of exemplary hybrid systems, see Vidal
and Legrain, Nucleic Acids Res. 27:919, 1999.

Unless otherwise defined, all technical terms used herein have the same meaning as
commonly understood by one of ordinary skill in the art to which this invention belongs.
Although methods and materials similar or equivalent to those described herein can be used
in the practice or testing of the present invention, suitable methods and materials are
described below. All publications, patent applications, patents, and other references
mentioned herein are incorporated by reference in their entirety. In the case of conflict, the
present specification, including definitions, will control. In addition, the particular
embodiments discussed below are illustrative only and not intended to be limiting.

Other features and advantages of the invention will be apparent from the following 
detailed description, and from the claims. 



Brief Description Of The Drawings

The invention is pointed out with particularity in the appended claims. The above 
and further advantages of this invention may be better understood by referring to the 
following description taken in conjunction with the accompanying drawings, in which: 

Figure 1 is a schematic illustration of an overview of a general cyclization reaction
within the invention.

Figure 2 is a schematic illustration of a series of chemical reaction steps that occur 
in a peptide cyclization method of the invention. 

Figure 3 is a genetic map of (a) plasmid pARCP, (b) plasmid pARCP-DHFR, (c)
plasmid pARCPAH-DHFR, (d) a modified vector having a cysteine (TGY) or serine
(TCN) codon generated by cloning into the Mfel site (N represents any nucleobase, S
represents C or G, and Y represents pyrimidines), (e) plasmid pARCP-p, and (f) plasmid
pARCBD-p.

Figure 4 is a photograph of a sodium dodecylsulfate polyacrylamide gel
electrophoresis (SDS-PAGE) analysis of dihydrofolate reductase (DHFR) cyclization on a
10-20% gradient, Tris/glycine ready-gel (Biorad).

Figure 5 is a graph of the activity of wild-type DHFR (triangles) and cyclic DHFR
(diamonds) after preincubation at 65°C.



Figure 6 is a schematic illustration of the expected endoproteinase Lys-C digestion
pattern for linear and cyclic DHFR.



Figure 7 is a photograph of FeCuY plates used in an in vivo assay to detect
tyrosinase inhibition by pseudostellarin F.

Figure 8 is a schematic illustration of a method for purifying cyclic peptides within
the invention.



Figure 9 is a schematic illustration of another method for purifying cyclic peptides 
within the invention. 

Figure 10 is a schematic illustration of a solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind a
splicing intermediate.

Figure 11 is a schematic illustration of another solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind a
splicing intermediate.

Figure 12 is a schematic illustration of another solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind a
splicing intermediate.

Figure 13 is a schematic illustration of another solid phase support/affinity
chromatography-based method for identifying/purifying molecules which specifically bind a
splicing intermediate.

Figure 14 is a schematic illustration of the use of aptamer scaffolds in the invention. 

Figure 15 is a schematic illustration of two reactions for preparing aptamers within 
the invention. 

Figure 16 is a schematic illustration of a method for screening within the invention.

Figure 17 is a schematic illustration of another method for screening within the
invention.

Figure 18 is a schematic illustration of another method for screening within the
invention.

Figure 19 is a schematic illustration of another method for screening within the
invention.



Detailed Description 
The trans-splicing ability of split inteins has been exploited to develop a general
method of producing cyclic peptides and splicing intermediates displaying peptides in a
looped conformation. In this method, a target peptide is interposed between two portions
of a split intein in a precursor polypeptide. In an appropriate host system, the two portions
of the split intein physically come together to form an active intein in a conformation that
also forces the target peptide into a loop configuration. In this configuration, the ester
isomer of the amino acid at the junction between one of the intein portions (e.g., I_N) and the
target peptide is stabilized such that a heteroatom from the other portion of the intein
(e.g., I_C) can then react with the ester to form a cyclic ester intermediate. The active intein
then catalyzes the formation of an aminosuccinimide that liberates a cyclized form of the
target peptide (i.e., a lactone form), which then spontaneously rearranges to form the
thermodynamically favored backbone cyclic peptide product (i.e., the lactam form). By
arresting the reaction at given points before liberation of the cyclic peptide, splicing
intermediates bearing the target peptide in a loop configuration can be produced. To
produce such peptides, nucleic acid molecules encoding a polypeptide having the target
peptide sequence interposed between the two intein portions can be constructed.
Introduction of these constructs into an expression vector provides a method for producing
the polypeptide in a host system, where the polypeptide can be spliced into a cyclic peptide
or a splicing intermediate. Using this method, several different cyclic peptides or splicing
intermediates can be prepared to generate a library of cyclized or partially-cyclized
peptides that can be screened for particular characteristics.

Referring to FIG. 1, an overview of an embodiment of the invention includes a
method of making a cyclic peptide from a nucleic acid molecule. In this method, a nucleic
acid molecule is prepared so that its nucleotide sequence encodes a polypeptide having in
consecutive order a first portion of a split intein (e.g., I_C), a peptide to be cyclized (i.e., a
target peptide), and a second portion of a split intein (e.g., I_N). The nucleic acid molecule
can be incorporated into an expression vector to facilitate its expression in a host system
where the nucleic acid can be transcribed and translated into a precursor polypeptide
having the peptide to be cyclized interposed between the two split intein portions. By the
steps described above, the two portions of the split intein come together and place the
precursor peptide in a conformation that sets off chemical reactions that ultimately yield a
cyclic peptide (see FIG. 2).
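
The consecutive order described above (first split intein portion, target peptide, second
split intein portion) can be illustrated with a minimal sketch. The class name, field names,
and the short nucleotide strings used here are hypothetical placeholders chosen for
illustration only; they are not intein sequences and are not taken from the invention.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrecursorConstruct:
    """Hypothetical container for the three coding segments of a precursor."""

    intein_first: str   # coding segment for the first split intein portion (e.g., I_C)
    target: str         # coding segment for the peptide to be cyclized
    intein_second: str  # coding segment for the second split intein portion (e.g., I_N)

    def __post_init__(self) -> None:
        # Each segment must be a whole number of codons so the fusion stays in frame.
        for name in ("intein_first", "target", "intein_second"):
            seq = getattr(self, name)
            if len(seq) == 0 or len(seq) % 3 != 0:
                raise ValueError(f"{name} must be a non-empty whole number of codons")

    def coding_sequence(self) -> str:
        """Join the segments in the order first portion, target, second portion."""
        return self.intein_first + self.target + self.intein_second

    def target_span(self) -> tuple[int, int]:
        """Return the half-open nucleotide interval occupied by the target."""
        start = len(self.intein_first)
        return start, start + len(self.target)


# Placeholder strings only; real constructs would use actual intein coding sequences.
construct = PrecursorConstruct("ATGAAA", "TGCGGT", "GCTTAA")
print(construct.coding_sequence())
print(construct.target_span())
```

The frame check is an assumption of this sketch; it reflects only the general requirement
that fused coding segments be read in a single reading frame.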



Nucleic Acid Molecules 

Nucleic acid molecules within the invention include those that encode a polypeptide
having a first portion of a split intein, a second portion of a split intein, and a target peptide
positioned in between the first portion of a split intein and the second portion of a split
intein. In one embodiment of the invention, expression of the nucleic acid molecule in a host
system results in a polypeptide that spontaneously splices in the host system to yield a
cyclized form of the target peptide. In another embodiment of the invention, expression of
the nucleic acid molecule in a host system results in a polypeptide that is a splicing
intermediate of a cyclized form of the target peptide. The nucleic acids of the invention can
be prepared according to the methods described herein, and can also be prepared using
the guidance provided herein in conjunction with methods for preparing and manipulating
nucleic acid molecules generally known in the art (See, e.g., Ausubel et al., eds., Current
Protocols in Molecular Biology, New York: John Wiley & Sons, 1997; Sambrook et al.,
Molecular Cloning: A Laboratory Manual (2nd Edition), Cold Spring Harbor Press, 1989).
For example, a nucleic acid molecule within the invention can be made by separately
preparing a polynucleotide encoding the first portion of a split intein, a polynucleotide
encoding the second portion of a split intein, and a polynucleotide encoding the target
peptide. The three polynucleotides can be ligated together to form a nucleic acid molecule
that encodes a polypeptide having the target peptide interposed between the first portion of
a split intein and the second portion of a split intein.

Nucleic Acids Encoding Inteins 

Nucleotide sequences that encode the first portion of a split intein and the second
portion of a split intein of the nucleic acid molecules within the invention can be derived
from known inteins. A fairly comprehensive and descriptive list of such inteins is published
by New England Biolabs at http://www.neb.com/inteins/int_reg.html. Any of these known
inteins can be used as long as they are compatible with the invention.

Nucleotide sequences that encode either naturally-occurring or artificially-produced
split inteins can be used to generate the intein portions of nucleic acid molecules within the
invention. Naturally-occurring split inteins are expressed in nature as two separate
components that bind one another to form one active splicing agent. The nucleic acid
molecules encoding these naturally-occurring components can thus be used in the invention.
One example of a naturally-occurring split intein that may be used is Ssp DnaE (Wu et al.,
Proc. Natl. Acad. Sci. USA 95:9226, 1998).

Inteins that are not split in their natural state (i.e., those that exist as one continuous
chain of amino acids) can be artificially split using known techniques. For example, two or
more nucleic acid molecules encoding different portions of such inteins can be made so that
their expression yields two or more artificially split intein components. See, e.g., Evans et
al., J. Biol. Chem. 274:18359, 1999; Mills et al., Proc. Natl. Acad. Sci. USA 95:3543,
1998. The nucleic acids that encode such non-naturally occurring intein components
(portions) can be used in the invention. Those nucleic acid molecules that encode non-
naturally occurring split intein portions which efficiently interact on the same precursor
polypeptide to yield cyclic peptides or splicing intermediates are preferred. Examples of
non-naturally occurring split inteins from which such nucleic acid molecules can be derived
include Psp Pol-1 (Southworth, M.W., et al., The EMBO J. 17:918, 1998),
Mycobacterium tuberculosis RecA intein (Lew, B.M., et al., J. Biol. Chem. 273:15887,
1998; Shingledecker, K., et al., Gene 207:187, 1998; Mills, K.V., et al., Proc. Natl. Acad.
Sci. USA 95:3543, 1998), Ssp DnaB/Mxe GyrA (Evans, T.C. et al., J. Biol. Chem.
274:18359, 1999), and Pfu (Otomo et al., Biochemistry 38:16040, 1999; Yamazaki et al.,
J. Am. Chem. Soc. 120:5591, 1998).

Nucleic Acids Encoding Target Peptides or Peptides Displayed in Splicing
Intermediates

Numerous methods of making nucleic acids encoding peptides of a known or random
sequence are known in the art. For example, polynucleotides having a predetermined or a
random sequence can be prepared chemically by solid phase synthesis using commercially
available equipment and reagents. Polymerase chain reaction can also be used to prepare
polynucleotides of known or random sequences. See, e.g., Ausubel et al., supra. As
another example, restriction endonucleases can be used to enzymatically digest a larger
nucleic acid molecule or even whole chromosomal DNA into a plurality of smaller
polynucleotide fragments that can be used to prepare nucleic acid molecules of the
invention.

Polynucleotides that encode peptide sequences to be cyclized are preferably
prepared so that one terminus of the polynucleotide encodes an asparagine, serine,
cysteine, or threonine residue to facilitate the cyclization reaction. Conversely,
polynucleotides that encode peptide sequences for production of splicing intermediates are
preferably prepared so that the terminus encodes an amino acid other than an asparagine,
serine, cysteine, or threonine residue so that the cyclization reaction is prevented.
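
The terminal-residue preference stated above can be expressed as a simple check. The
following is a minimal sketch only: the function name and the returned labels are
hypothetical, and the rule encoded is limited to what is stated in the preceding paragraph
(asparagine, serine, cysteine, or threonine at the relevant terminus favors cyclization; any
other residue at that terminus is used to obtain a splicing intermediate).

```python
# One-letter codes for asparagine (N), serine (S), cysteine (C), and threonine (T).
PERMISSIVE_RESIDUES = frozenset("NSCT")


def expected_product(terminal_residue: str) -> str:
    """Classify a construct by the residue encoded at the relevant terminus.

    The labels returned are illustrative; they restate the preference described
    in the text and do not predict splicing efficiency.
    """
    residue = terminal_residue.upper()
    if len(residue) != 1 or not residue.isalpha():
        raise ValueError("expected a single one-letter amino acid code")
    if residue in PERMISSIVE_RESIDUES:
        return "cyclic peptide"
    return "splicing intermediate"


for code in ("C", "S", "A"):
    print(code, "->", expected_product(code))
```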

Ligating Polynucleotides Encoding Intein Portions and Target Peptides or Peptides
Displayed in Splicing Intermediates

Once generated, nucleic acid molecules encoding intein portions can be ligated by
conventional methods to a nucleic acid molecule encoding a target peptide (or peptide
within a splicing intermediate) to form a larger nucleic acid molecule encoding a polypeptide
having the first intein portion-target peptide-second intein portion order. See, e.g., Ausubel
et al., supra.

Nucleic Acid Molecules that Encode Multiple Split Inteins and Multiple Peptides

Using techniques similar to those described above, one skilled in the art could also
prepare nucleic acid constructs that encode more than one set of two portions of a split
intein interposed with peptides. For example, the invention includes nucleic acid
molecules encoding precursor polypeptide molecules comprised of N polypeptides (N =
an integer greater than or equal to 1) and having N target peptides interposed between 2N
intein portions such that any target peptide i (i = an integer from 1 to N representing the
position of a target peptide in the precursor polypeptide) is interposed between intein
portions 2i-1 and 2i (e.g., target peptide 1 is between intein portions 1 & 2, target peptide 2
is between intein portions 3 & 4, etc.). As long as intein portions 2i-1 and 2i are not
complementary (i.e., able to physically interact to catalyze a splicing event), target peptide i
cannot cyclize. If, however, intein portion 2i is complementary with intein portion 2i+1 and
intein portion 2N is complementary with intein portion 1, the entire ensemble of N
polypeptides can perform N-1 trans splices (between 2 polypeptides) and 1 cis splice
(ligating the two ends together) to give rise to a product wherein target peptides 1 through N
are covalently attached to one another in a cyclic peptide/protein (e.g., intein portions 2 & 3
trans-splice target peptides 1 & 2; intein portions 4 & 5 trans-splice target peptides 2
& 3; intein portions 2N-2 & 2N-1 trans-splice target peptides N-1 & N; and intein
portions 2N & 1 cis-splice to close the cyclic product containing the N target sequences).
The order of trans/cis splicing events is irrelevant. The slowest splicing species (whether it is
the complementary intein portions 2N & 1, 2 & 3, or 80 & 81) will by default perform the
cis-splice.
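
The numbering scheme described above can be made concrete with a short sketch. The
function names are hypothetical; the index arithmetic follows the description (target
peptide i lies between intein portions 2i-1 and 2i, each even-numbered portion pairs with
the next odd-numbered portion, and portion 2N pairs with portion 1).

```python
def flanking_portions(i: int) -> tuple[int, int]:
    """Return the intein portions on either side of target peptide i (1-based)."""
    if i < 1:
        raise ValueError("target peptide positions start at 1")
    return 2 * i - 1, 2 * i


def complementary_pairs(n: int) -> list[tuple[int, int]]:
    """Return the intein portion pairs that must be complementary for N polypeptides.

    Pairs (2i, 2i+1) for i = 1..N-1 join neighboring polypeptides; the final
    pair (2N, 1) joins the last polypeptide back to the first.
    """
    if n < 1:
        raise ValueError("N must be at least 1")
    pairs = [(2 * i, 2 * i + 1) for i in range(1, n)]
    pairs.append((2 * n, 1))
    return pairs


n = 3
for i in range(1, n + 1):
    print("target peptide", i, "lies between portions", flanking_portions(i))
print("complementary pairs:", complementary_pairs(n))
```

For N polypeptides the list always contains N pairs, corresponding to the N-1 trans splices
and 1 cis splice described above. Which pair ends up performing the cis splice depends on
splicing order and is not modeled here.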

Thus, nucleic acid constructs can be made that express two or more polypeptides
each composed of a target peptide interposed between two portions of a split intein where
the intein components are not complementary (i.e., do not derive from the same intein or
otherwise cooperate to catalyze any of the cyclization reactions). In such constructs, no
one polypeptide could be cyclized unless it was expressed in the presence of a second
polypeptide having the appropriate complementary intein component. Constructs of such
nucleic acids within the invention could encode only one polypeptide per construct or more
than one polypeptide per construct (e.g., a bi-functional plasmid).



Expression Vectors 

The expression vectors of the present invention can be prepared by inserting
polynucleotides encoding a target peptide into any suitable expression vector that can
facilitate expression of the polynucleotide in a host system. Such suitable vectors include
plasmids, bacteriophages, and viral vectors. A large number of these are known in the art,
and many are commercially available or obtainable from the scientific community. Those of
skill in the art can select suitable vectors for use in a particular application based upon, e.g.,
the type of host system selected (e.g., in vitro systems, prokaryotic cells such as bacteria,
and eukaryotic cells such as yeast or mammalian cells) and the expression conditions
selected.

Expression vectors within the invention can include a stretch of nucleotides that
encodes a target polypeptide and a stretch of nucleotides that operates as a regulatory
domain that modulates or controls expression (e.g., transcription) of nucleotide sequences
within the vector. For example, the regulatory domain can be a promoter or an enhancer.

Expression vectors within the invention can include nucleotide sequences that
encode a peptide that facilitates screening of the cyclized form of the target peptide or
splicing intermediate for a particular characteristic (e.g., an affinity tag such as a chitin-
binding domain or a biotin tag; a colored or light-emitting label; a radioactive tag; etc.), or
purifying the cyclized form of the target peptide or splicing intermediate from a host system
(e.g., an affinity tag such as a chitin-binding domain or a biotin tag; a colored or
light-emitting label; a radioactive tag; etc.).

In preferred embodiments, the expression vectors within the invention are produced
with restriction sites both between and within the nucleic acid sequences that encode the
split intein portions to enable the cloning of a wide variety of cyclization targets or splicing
intermediates. In some embodiments, an expression vector of the invention can be an
inducible expression vector, such as an arabinose inducible vector. Such vectors can be
utilized to control expression of cyclization precursors or splicing intermediates within a host
system. Other vectors can be selected for use in the invention based on their compatibility
with known bacterial expression strains and hybrid systems. See, e.g., Zhang et al., Curr.
Biol. 9:417, 1999; Pelletier et al., Nat. Biotechnol. 17:683, 1999; Karimova et al., Proc.
Natl. Acad. Sci. USA 95:5752, 1998; Dmitrova et al., Mol. Gen. Genet. 257:205, 1998;
Xu et al., Proc. Natl. Acad. Sci. USA 96:151, 1999; Rossi et al., Proc. Natl. Acad. Sci.
USA 94:8405, 1997.



Polypeptides 

Polypeptides within the invention include any that can be produced by expression of
a nucleic acid of the invention. For example, a substantially pure precursor polypeptide that
has a target peptide (or a peptide to be displayed by a splicing intermediate) interposed
between the first portion of a split intein and the second portion of a split intein is included in
the invention. In some embodiments of the precursor polypeptide, the target peptide may
be directly fused to the first and second intein portions. The precursor polypeptide
spontaneously splices in the host system to yield a cyclized form of the target peptide (or a
splicing intermediate displaying a peptide).

Cyclized forms of target peptides and splicing intermediates displaying peptides
are also within the invention. Preferably, these are produced by splicing of a precursor
polypeptide of the invention. The cyclized forms of target peptides can be of any amino
acid sequence that can be cyclized by the methods of the invention. The splicing
intermediate can be an active intein intermediate, a thioester intermediate, or a lariat
intermediate, and can display a peptide of any compatible amino acid sequence.

Host Systems 

Host systems that may be used in the invention include any systems that
support transcription, translation, and/or replication of a nucleic acid molecule of the
invention; or that support post-translational modification (e.g., splicing) of a polypeptide or
protein of the invention. Numerous such host systems are known. For example, in the
invention, especially when it is desired to avoid artifacts or interference caused by living
host systems, the host system can take the form of an in vitro transcription/translation
system. Such systems can be fabricated in the laboratory according to published
techniques or can be commercially purchased. For instance, STP2-T7 (cat. No. 69950-3)
and STP-SP6 (cat. No. 69997-3) are available from Novagen (Madison, WI). Promega
(Madison, WI) also sells such systems (e.g., cat. Nos. L1170, L2080, L4600, L4610,
L4130, L4140, L1130, L1020, and L1030), as does Stratagene (La Jolla, CA) which
markets a system branded IN VITRO EXPRESS (cat. No. 200360). Non-living host
systems for use in the invention can also be derived from a living organism. For example, a
cell lysate such as a reticulocyte lysate can be used in some applications.

Host systems can also take the form of living organisms. Living organisms are
preferred for host systems because they can usually be reproduced in numerous copies
thereby providing a continuous, readily expandable, and easily manipulated source of
selected nucleic acid molecules. Living organisms that can be used as host systems within
the invention include prokaryotes such as bacteria (e.g., Escherichia coli) and eukaryotes
such as yeasts and mammalian (e.g., human, murine, bovine, ovine, porcine, etc.) cells.
Archaebacteria, plant cells, and any other organism suitable for use with the methods of the
invention can also function as the host system.

The particular host system best suited for a particular application will vary
depending on many different factors. One of skill in the art, however, should be able to
select a suitable host system for a particular application based on known applications of the
different host systems. For example, where large scale production of a cyclic peptide is
desired, a bacterial host or an insect host would be suitable. As another example, where it
is desired to analyze the interaction of human cell components, using a human cell as the
host system would likely be more appropriate than using a bacterial system.

Method of Making a Polypeptide, Cyclic Peptide, or a Splicing Intermediate

The polypeptides of the invention can be prepared by conventional methods of
producing polypeptides of a known amino acid sequence. For example, polypeptides
within the invention can be made by solid phase synthesis using commercially available
equipment and reagents. Known in vitro methods of producing cyclic peptides can also
be used to produce cyclic peptides. In many cases, however, the polypeptides of the
invention are preferably produced by expressing nucleic acid molecules encoding them in a
host system. For example, nucleic acid molecules within the invention can be incorporated
into an expression vector and then introduced into a host system. The host system can then
be placed under conditions that cause the vector to be expressed, resulting in the formation
of a precursor peptide and subsequently a cyclized form of the target peptide or a splicing
intermediate displaying the peptide.

A preferred method for making a cyclic peptide or a splicing intermediate includes
the steps of: (a) providing an isolated nucleic acid molecule that encodes a polypeptide
having a target peptide interposed between the first portion of a split intein and the second
portion of a split intein; (b) providing a host system; (c) introducing the isolated nucleic acid
molecule into the host system; and (d) expressing the isolated nucleic acid molecule.
Expression of the nucleic acid molecule in the host system produces the peptide molecule in
the form of a splicing intermediate of a cyclized form of the target peptide, or a polypeptide
that spontaneously splices to yield a cyclized form of the target peptide.

In preferred embodiments of this method, production of the polypeptides, cyclic
peptides, or splicing intermediates takes place in vivo (e.g., with a living host system) and in
the absence of any exogenously-added agent, such as an agent to catalyze cyclization of a
peptide (e.g., a protease or a thiol).

Production of polypeptides, cyclic peptides, or splicing intermediates can be
monitored using standard techniques for characterizing proteins. See, e.g., Sambrook et al.,
supra. Exemplary techniques that can be used include conventional chromatography,
HPLC, FPLC and the like; electrophoresis such as sodium dodecyl sulfate polyacrylamide
gel electrophoresis (SDS/PAGE) and 2-dimensional gel electrophoresis; electromagnetic
radiation-based spectroscopy; mass spectroscopy; analysis of enzymatic digestion
products; thermostability assays; etc.



Purification of Polypeptides, Cyclic Peptides, or Splicing Intermediates of the Invention

Conventional methods of purifying proteins can be adapted to purify the
polypeptides, cyclic peptides, and splicing intermediates of the invention. The invention
also includes a preferred method for purifying a cyclic peptide from a mixture. In this
method, an affinity tag is attached to the cyclic peptide to aid in its purification. This
method includes the steps of: (a) providing a mixture containing a cyclic peptide conjugated
with an affinity tag; (b) mixing the conjugated cyclic peptide with a solid phase support
having a ligand thereon that specifically binds the affinity tag so that the support becomes
specifically bound with the cyclic peptide; (c) washing the support to remove non-
specifically bound matter; and (d) eluting the cyclic peptide from the support.

In this method, the affinity tag can be any molecule that can bind a ligand on a solid
phase support. For example, the affinity tag can be a chitin-binding domain where the
ligand is chitin (see examples section below) or it can be a biotin tag where the ligand is
streptavidin. Many other affinity tag-ligand pairs are known and can be used in the
invention. Because the affinity tag specifically binds the ligand on the solid phase support,
the cyclic peptides (with the attached affinity tag) will specifically bind the support. The
support can then be washed with a buffer (e.g., a high salt, acid, or alkaline buffer) that
removes matter within the mixture that is non-specifically bound to the support. The
affinity-tagged cyclic peptide can then be eluted from the solid phase support using a buffer
containing a substance that separates the tag from the ligand (e.g., a competitive inhibitor
such as excess unconjugated affinity tag; or a denaturing agent), or an enzyme or chemical
reactant that cleaves the cyclic peptide from the affinity tag.
In an analogous manner, splicing intermediates rather than cyclic peptides can be
purified. Cyclic peptides can also be purified from a mixture using splicing intermediates.
For example, a method for purifying a cyclic peptide from a mixture includes the steps of:
(a) providing a mixture containing a splicing intermediate conjugated with an affinity tag; (b)
mixing the conjugated splicing intermediate with a solid phase support having a ligand
thereon that specifically binds the affinity tag such that the support becomes specifically
bound with the splicing intermediate; (c) washing the support to remove non-specifically
bound matter; (d) adding to the support a reagent that makes a cyclic peptide from the
splicing intermediate; and (e) eluting the cyclic peptide from the support. In a variation of
the foregoing, steps (d) and (e) are reversed so that step (d) is eluting the splicing
intermediate from the support and step (e) is adding to the eluted splicing intermediate a
reagent that makes a cyclic peptide from the splicing intermediate. Reagents that may be
added to make a cyclic peptide from a splicing intermediate include thiols, proteases, and
other substances which can catalyze cyclization of the splicing intermediate.

As a specific example, by fusing I_C to an affinity tag and removing the essential
asparagine residue (see d in FIG. 3), a cyclic ester can be immobilized on an affinity
column. The resulting cyclic peptide column can be used for the affinity purification of the
cyclic peptide itself. A wide range of proteolytic methods can be employed to liberate the
cyclic ester from the affinity tag and I_C depending upon the sequence of the cyclic peptide
product.

Referring now to FIG. 8, a method for purifying cyclic peptides is shown. In this
method, an active intein intermediate (species 1) is mutagenized to replace the catalytic
asparagine (step A) with a non-catalytic amino acid (Y) and to introduce an affinity tag
downstream of I_N (step B) to yield species 2. The intein-mediated cyclization reaction will
proceed until the lariat intermediate is formed (step C). This molecule is then passed
through an affinity column (step D) having a solid phase support with a ligand thereon that
specifically binds the affinity tag and thus allows retention and purification of the I_N/I_C
non-covalent complex (species 3). The I_N/I_C reaction is then disrupted to yield a lariat
intermediate (species 4) which can be eluted from the affinity column. Proteolytic or
chemical cleavage at amino acid Y (step F) liberates the lactone intermediate (species 5).
Acyl-to-N rearrangement (step G) yields the thermodynamically preferred amide cyclic
product (species 6).

Referring now to FIG. 9, a variation of the foregoing method for purifying cyclic
peptides is shown. In this method, an active intein intermediate (species 1) is mutagenized
to replace the catalytic asparagine (step A) with a non-catalytic amino acid (Y) and to
introduce an affinity tag upstream of I_C (step B) to yield species 2. The intein-mediated
cyclization reaction will proceed until a lariat intermediate is formed (step C). This molecule
is then passed through an affinity column (step D) having a solid phase support with a ligand
thereon that specifically binds the affinity tag/I_C intermediate (species 3). Separation of the
affinity tag (step E) from the ligand (e.g., using a molecule that competitively inhibits the tag-
ligand interaction, using a high salt buffer or denaturing agent, or using a chemical reagent or
protease to cleave the tag) allows recovery of the lariat intermediate (species 4). Proteolytic
or chemical cleavage at amino acid Y (step F) liberates the lactone intermediate (species 5).
Acyl-to-N rearrangement (step G) yields the thermodynamically preferred amide cyclic
product (species 6).

Method for Preparing a Library of Cyclic Peptides and Splicing Intermediates 

Numerous methods of making linear peptide libraries are known in the art.
Modifications of such known methods can be utilized with the methods of producing cyclic
peptides and splicing intermediates taught herein to generate libraries of cyclic peptides and
splicing intermediates. In general, a method of preparing a library of cyclic peptides and/or
splicing intermediates includes the steps of: (a) providing a plurality of nucleic acid
molecules encoding a plurality of target peptides having heterogeneous amino acid
sequences; (b) incorporating each of the plurality of nucleic acid molecules into an
expression vector to form a plurality of expression vectors, whereby each of the plurality of
nucleic acid molecules is interposed between a nucleic acid molecule encoding a first
portion of a split intein and a nucleic acid molecule encoding a second portion of a split
intein in each of the formed expression vectors, such that expression of the expression
vectors in a host system results in the production of a plurality of splicing intermediates of
cyclized forms of the target peptides or polypeptides that spontaneously splice in the host
system to yield cyclized forms of the target peptides; and (c) expressing the expression
vectors in the host system.

As more specific examples, the methods described in Childs et al., in Sequence
Specificity in Transcription and Translation (Alan R. Liss, Inc., 1985) and the double strand
ligation method described in Schumacher et al., Science 271:1854, 1996 can be modified
for use in the current invention. Known PCR-based methods can also be used to generate
polynucleotides encoding peptides with random sequences that can be circularized or
expressed as splicing intermediates in the invention. See, e.g., Caldwell and Joyce, PCR
Methods Appl. 2:22, 1992; Ostermeier et al., Proc. Natl. Acad. Sci. USA 96:3562, 1999;
the Nested Deletion Protocol and Reagents from Promega; and Stemmer, W.P. Nature
370:389, 1994 (DNA shuffling). The plurality of polynucleotides encoding peptides with
heterogeneous sequences can be incorporated as the target peptide (or the peptide to be
displayed in a splicing intermediate) in the nucleic acid molecules and expression vectors of
the invention as described above and then expressed in a host system to make a library of
cyclic peptides or splicing intermediates.
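
As an illustration of how randomized polynucleotides map onto encoded residues, the
following minimal sketch expands degenerate codons written with the symbols used in the
description of FIG. 3 (N = any nucleobase, S = C or G, Y = a pyrimidine) and translates
them with the standard genetic code. The function names are hypothetical, and the NNS
codon is used here only as an assumed example of a randomized position; it is not stated
in the text to be the scheme used in the invention.

```python
from itertools import product

# Degenerate nucleotide symbols limited to those defined for FIG. 3, plus the four bases.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T", "S": "CG", "Y": "CT", "N": "ACGT"}

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(codon: str) -> list[str]:
    """List every concrete codon matched by a three-letter degenerate codon."""
    if len(codon) != 3:
        raise ValueError("a codon has exactly three positions")
    return ["".join(p) for p in product(*(DEGENERATE[base] for base in codon.upper()))]


def encoded_residues(codon: str) -> set[str]:
    """Return the one-letter residues ('*' marks a stop) encoded by a degenerate codon."""
    return {CODON_TABLE[c] for c in expand_codon(codon)}


print(sorted(expand_codon("TGY")), encoded_residues("TGY"))  # cysteine codons
print(sorted(expand_codon("TCN")), encoded_residues("TCN"))  # serine codons
print(len(expand_codon("NNS")), len(encoded_residues("NNS")))
```

A check of this kind can be used to confirm that a chosen degenerate codon encodes the
intended residue (e.g., only cysteine for TGY or only serine for TCN) before
oligonucleotides are synthesized.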



Method for Screening a Cyclic Peptide for a Predetermined Characteristic 

Myriad techniques exist for screening small molecules for particular characteristics.
See, e.g., Fernandes, P., Current Opin. Chem. Biol. 2:597, 1998; Science 286:1759, 1999;
U.S. Patent Nos. 5,585,277 and 5,989,814. More specifically, methods for determining
which peptide in a combinatorial peptide library binds specifically to a target protein are
also known. E.g., U.S. Patent No. 5,834,318. Many of these methods can be adapted to
screen cyclic peptides and/or splicing intermediates made with the methods of the invention
for particular characteristics.

A general method of screening a peptide molecule for a predetermined
characteristic includes the steps of: (a) providing a nucleic acid molecule that encodes a
polypeptide having a target peptide interposed between a first portion of a split intein and a
second portion of a split intein such that expression of the nucleic acid molecule in a host
system produces the peptide molecule either as a cyclized form of the target peptide (as a
result of spontaneous splicing of the polypeptide in the host system) or a splicing
intermediate of a cyclized form of the target peptide; (b) providing the host system; (c)
introducing the isolated nucleic acid molecule into the host system; (d) placing the host system
under conditions that cause the peptide molecule to be produced; and (e) testing the
peptide molecule for the predetermined characteristic.

Step (a) can be performed as described elsewhere herein by, for example, using 
molecular biology techniques (see Ausubel et al. and Sambrook et al., supra) to produce 
polynucleotides encoding the target peptide, the first portion of an intein, and the second 
portion of an intein. The resulting three polynucleotides can then be fused (e.g., ligated) 
together to form the nucleic acid molecule. The host system provided can be any of those 
described herein in which the nucleic acid molecule can be expressed (e.g., a bacterium, a 
yeast, a mammalian cell, etc.). The nucleic acid can be introduced into the host system by 
known methods depending on the form of the nucleic acid molecule and the host system 
used. For example, the nucleic acid molecules can be introduced into a cell by 
electroporation, lipofection, using calcium chloride-mediated transformation, using a gene 
"gun," using a bacteriophage vector (when the host system is a bacterium), using a plasmid 
construct, using a viral vector, etc. 

The host system can be placed under conditions that cause the peptide molecule to 
be produced by adjusting the conditions according to the particular form of the nucleic acid 
molecule and the host system used. For a human cell host system, this can mean placing 
the cell in an appropriate nutrient-rich medium and culturing the cell in a 37°C, humidified, 
5-10% CO2 incubator. For inducible expression vectors, this can mean adding the 
substance to the host system that induces expression of the nucleic acid molecule in the 
vector. For example, when using an arabinose-inducible expression vector, this step can 
include adding arabinose to the host system. 

Testing of the peptide molecule for the predetermined characteristic can be 
performed by a large number of different methods, e.g., measuring binding of the peptide 
molecule to a known ligand and analyzing the ability of the peptide molecule to modulate 
(i.e., increase or decrease the rate of) a biochemical reaction. For a description of various 
methods that may be used for testing peptides for predetermined characteristics see 
Fernandes, P., supra. 

More specific exemplary methods that may be used to screen cyclic peptides or 
splicing intermediates for a particular characteristic include using a solid phase support and 
affinity chromatography to identify molecules which specifically bind cyclic peptides or 
splicing intermediates; using phage display technology; and using aptamer peptide fusion 
constructs and/or hybrid systems to identify cyclic peptides or splicing intermediates that 
can modulate a specific biochemical reaction or intracellular event. 



Solid Phase Supports/Affinity Chromatography For Identifying Molecules That 
Interact with Cyclic Peptides and/or Splicing Intermediates 

Cyclic peptides or splicing intermediates can be immobilized on a solid phase 
support to facilitate identification and/or purification of molecules that specifically bind a 
given cyclic peptide or splicing intermediate. For a general overview of peptide affinity 
columns for purification see Baumbach, G.A. and D.J. Hammond, Biopharm., 5:24, 1992. 
Below are examples of how this can be performed in the invention. 

Referring now to FIG. 10, a method for identifying/purifying molecules that 
specifically bind a given splicing intermediate is shown. In this method, an active intein 
intermediate (specie 1) is mutagenized to replace the catalytic asparagine (step A) with a 
non-catalytic amino acid (Y) and to introduce an affinity tag upstream of I_C (step B) to yield 
specie 2. The intein-mediated cyclization reaction will proceed until a lariat intermediate is 
formed (step C). This molecule is then passed through an affinity column (step D) having a 
solid phase support with a ligand thereon that specifically binds the affinity tag/I_C 
intermediate (specie 3). A solution containing target molecules (i.e., candidates for binding 
the splicing intermediates) is then passed through the column (step E). Target molecules 
that specifically bind the splicing intermediate are selectively retained within the column. 
These target molecules can be removed from the column and biochemically analyzed (e.g., 
sequenced). 

Referring now to FIG. 11, another method for identifying/purifying molecules 
which specifically bind a given splicing intermediate is shown. In this method, an active 
intein intermediate (specie 1) is mutagenized to replace the catalytic asparagine (step A) 
with a non-catalytic amino acid (Y) and to introduce an affinity tag downstream of I_N (step 
B) to yield specie 2. The intein-mediated cyclization reaction will proceed until a lariat 
intermediate is formed (step C). This molecule is then passed through an affinity column 
(step D) having a solid phase support with a ligand thereon that specifically binds the affinity 
tag to allow the retention and purification of the I_N/I_C non-covalent complex (specie 3). 
Cleavage of the affinity tag (step E) allows recovery of the lariat intermediate (specie 4). A 
solution containing target molecules (i.e., candidates for binding the splicing intermediates) is 
then passed through the column (step E). Target molecules that specifically bind the 
splicing intermediate are selectively retained within the column. 

Referring now to FIG. 12, another method for identifying/purifying molecules that 
specifically bind a given splicing intermediate is shown. In this method, an active intein 
intermediate (specie 1) is mutagenized to replace the I_N nucleophile (step A) with a 
non-catalytic amino acid (Z) and to introduce an affinity tag downstream of I_N (step B) to yield 
an I_C-peptide-I_N-tag fusion protein (specie 2). The intein-mediated cyclization reaction will 
produce the fusion protein (step C). This protein is then passed through an affinity column 
(step D) having a solid phase support with a ligand thereon that specifically binds the affinity 
tag to allow the retention and purification of the protein complex (specie 3). A solution 
containing target molecules (i.e., candidates for binding the splicing intermediates) is then 
passed through the column (step E). Target molecules that specifically bind the splicing 
intermediate are selectively retained within the column. 

Referring now to FIG. 13, yet another method for identifying/purifying molecules 
which specifically bind a given splicing intermediate is shown. In this method, an active 
intein intermediate (specie 1) is mutagenized to replace the I_N nucleophile (step A) with a 
non-catalytic amino acid (Z) and to introduce an affinity tag upstream of I_C (step B) to yield 
a tag-I_C-peptide-I_N fusion protein (specie 2). The intein-mediated cyclization reaction will 
produce the fusion protein (step C). This protein is then passed through an affinity column 
(step D) having a solid phase support with a ligand thereon that specifically binds the affinity 
tag to allow the retention and purification of the protein complex (specie 3). A solution 
containing target molecules (i.e., candidates for binding the splicing intermediates) is then 
passed through the column (step E). Target molecules that specifically bind the splicing 
intermediate are selectively retained within the column. 

Phage Display 

Methods of screening molecules using phage display are also within the invention. 
Conventional methods using phage display can be modified by using the phage to display 
the cyclic peptides and/or splicing intermediates of the invention. For example, if Z in FIG. 
1 is a phage coat protein and XH=H in FIG. 2, the splicing reaction will not progress 
beyond the first ester intermediate, thus resulting in the target peptide being displayed as a 
loop. In this manner, libraries comprising phage particles displaying loop target peptides 
can be prepared and used to pan for molecules that bind the displayed loop. For instance, 
a target molecule can be immobilized on a solid phase support. Phage libraries displaying 
different looped peptides can then be mixed with the support. Those phage displaying 
looped peptides that bind the target molecule would be selectively retained on the support. 
After elution from the support (e.g., by cleavage of the ester linkage of the phage-displayed 
loop peptides with high concentrations of a potent nucleophile), the amino acid sequences 
of the looped peptides can be determined by standard molecular biology methods. 

Aptamers 

Peptide aptamers are polypeptides that contain a conformationally constrained 
target peptide region of variable sequence displayed from a scaffold. Since cyclic peptides or 
splicing intermediates can function as aptamers, known methods of analyzing aptamers can 
be modified to assist in identifying particular characteristics of the cyclic peptides or splicing 
intermediates. See, e.g., Geyer et al., Proc. Natl. Acad. Sci. USA 96:8567, 1999; 
Caponigro et al., Proc. Natl. Acad. Sci. USA 95:7508, 1998; Mikhail et al., Proc. Natl. 
Acad. Sci. USA 95:14266, 1998; Norman et al., Science 285:591, 1999. 

For example, referring to FIG. 14, cyclic proteins can be used as aptamer scaffolds 
in a technique that allows members of a peptide library to be displayed as a constrained 
loop between the N-terminus and C-terminus of the cyclic protein scaffold. As shown in 
FIG. 14, an aptamer library can be expressed as an I_C-scaffold-I_N fusion protein (specie 
1). Progression of the intein-mediated cyclization reaction in vivo (step A) yields an I_N, an 
I_C, and a cyclic scaffold protein (specie 2). The aptamer library is displayed in the linker 
region between the N-terminus and C-terminus. In FIG. 14, N represents any amino acid, 
the subscripts n and m are any integers equal to or greater than 0, and X 


represents serine, threonine, or cysteine. 

Other examples of using aptamers are also within the invention. Referring to FIG. 
15, two such methods are described. In reaction I, an active intein intermediate is 
mutagenized to replace the nucleophilic amino acid from I_N (step A) with a non-catalytic 
amino acid (Z). These processes inactivate the splicing reaction to yield specie 2. Because 
of the strong interaction between I_C and I_N, this technique allows the members of a peptide 
library to be displayed as a constrained loop in the linker region between the two intein 
portions (Target). In reaction II, an active intein intermediate is mutagenized to replace the 
catalytic asparagine (step A) with a non-catalytic amino acid (Y) to yield specie 2. 
The intein-mediated cyclization reaction proceeds in vivo (step B) and arrests 
at the lariat intermediate stage (specie 3), allowing members of a peptide library to be 
displayed as a constrained lactone. 



Hybrid Systems 

The yeast two-hybrid system is a well-studied method for analyzing in vivo 
protein-protein interactions. Fields, S. and O. Song, Nature (London) 340:245, 1989. It and 
variations thereof such as one-hybrid systems, three-hybrid systems, reverse two-hybrid 
systems, split-hybrid systems, alternative n-hybrid systems, and small molecule-based hybrid 
systems can be used to analyze the characteristics of cyclic peptides and/or splicing 
intermediates by adapting known methods. See, e.g., Drees, B. L., Current Opin. Chem. 
Biol., 3:64, 1999; Vidal, M., and P. Legrain, Nucleic Acids Research, 27:919, 1999; 
Current Protocols in Molecular Biology, eds., Ausubel, F. M., et al., Wiley, New York, 
1996; Huang, J. and S.L. Schreiber, Proc. Natl. Acad. Sci. (USA) 94:13396, 1997; Yang, 
M., et al., Nucleic Acids Res. 23:1152, 1995; Colas, P., et al., Nature (London), 380:548, 
1996; Xu, C.W., et al., Proc. Natl. Acad. Sci. (USA) 94:12473, 1997. 

For example, referring to FIG. 16, a method of identifying a target protein that 
interacts with a splicing intermediate is within the invention. In this method, an active intein 
intermediate (specie 1) is mutagenized to replace the catalytic asparagine (step A) with a 
non-catalytic amino acid (Y) and to introduce a DNA-binding domain downstream of I_N 
(step B) to yield specie 2. The intein-mediated cyclization reaction will proceed until the 
lariat intermediate (specie 3) is formed (step C). I_N and I_C form a strong non-covalent 
complex. The resulting lariat intermediate is then co-expressed with a target protein 
attached to a DNA-binding domain (step D). Interaction of the lariat intermediate with the 
target protein (specie 4) causes activation of a promoter region (step E) leading to 
expression of the reporter gene (*). This method allows identification of target molecules 
able to bind the lariat intermediate. This method can be modified such that a known 
molecule (in place of an unknown target protein) is attached to a DNA-binding domain, so 
that lariat intermediates displaying a looped peptide that binds the known molecule can be 
identified. 

Referring to FIG. 17, another method for identifying a target protein that interacts 
with a splicing intermediate is described. In this method, an active intein intermediate 
(specie 1) is mutagenized to replace the catalytic asparagine (step A) with a non-catalytic 
amino acid (Y) and to introduce a DNA-binding domain upstream of I_C (step B) to yield 
specie 2. The intein-mediated cyclization reaction will proceed until the lariat intermediate 
(specie 3) is formed (step C). This molecule is then co-expressed with a target protein 
attached to a DNA-binding domain (step D). Interaction of the lariat intermediate with the 
target protein (specie 4) causes activation of a promoter region (step E) leading to 
expression of the reporter gene (*). This method allows identification of target molecules 
able to bind the lariat intermediate. This method can be modified such that a known 
molecule (in place of an unknown target protein) is attached to a DNA-binding domain, so 
that lariat intermediates displaying a looped peptide that binds the known molecule can be 
identified. 

Referring now to FIG. 18, another method for identifying a target protein that 
interacts with a splicing intermediate is described. In this method, an active intein 
intermediate (specie 1) is mutagenized to replace the I_N nucleophile (step A) with a 
non-catalytic amino acid (Z) and to introduce a DNA-binding domain (DBD) downstream of I_N 
(step B). These processes will inactivate the splicing reaction and will generate an 
I_C-peptide-I_N-DBD fusion protein (specie 2). This molecule is then co-expressed with a target 
protein attached to a DNA-binding domain (step C). Interaction of the fusion protein with 
the target protein (specie 3) causes activation of a promoter region (step D) leading to 
expression of the reporter gene (*). This method allows identification of target molecules 
able to bind the fusion protein. This method can be modified such that a known molecule 
(in place of an unknown target protein) is attached to a DNA-binding domain, so that fusion 
proteins displaying a looped peptide that binds the known molecule can be identified. 

Referring to FIG. 19, yet another method for identifying a target protein that 
interacts with a splicing intermediate is described. In this method, an active intein 
intermediate (specie 1) is mutagenized to replace the I_N nucleophile (step A) with a 
non-catalytic amino acid (Z) and to introduce a DNA-binding domain upstream of I_C (step B). 
These processes will inactivate the splicing reaction and will generate a DBD-I_C-peptide-I_N 
fusion protein (specie 2). This molecule is then co-expressed with a target protein attached 
to a DNA-binding domain (step C). Interaction of the fusion protein with the target protein 
(specie 3) in step D causes activation of a promoter region (step E) leading to expression of 
the reporter gene (*). This method allows identification of target molecules able to bind the 
fusion protein. This method can be modified such that a known molecule (in place of an 
unknown target protein) is attached to a DNA-binding domain, so that fusion proteins 
displaying a looped peptide that binds the known molecule can be identified. 


Targeting Cyclic Peptides and Splicing Intermediates In Vivo 

The cyclic peptides and splicing intermediates within the invention can be 
specifically targeted to particular cellular locales or for extracellular secretion by using 
modifications of targeting methods known in the art. See, e.g., Wilkinson et al., J. 
Membrane Biol., 155:189, 1997; Komiya et al., The EMBO J., 17:3886, 1998; Kouranov 
and Schnell, J. Biol. Chem. 271:31009, 1996; Bhagwat et al., J. Biol. Chem., 274:24014, 
1999; Adam, S.A., Current Opin. Cell Biol. 11:402-406, 1999; Gorlich, D., Current Opin. 
Cell Biol. 9:412, 1997; Pemberton et al., Current Opin. Cell Biol. 10:292, 1998; 
Sakaguchi, M., Current Opin. Cell Biol. 8:595, 1997; Folsch et al., The EMBO J. 
17:6508, 1998. For example, various signal peptides can be attached to the cyclic 
peptides or splicing intermediates of the invention to cause them to localize to 
predetermined cellular compartments or to be secreted into the extracellular space after 
translation. In this manner, the cyclic peptides or splicing intermediates of the invention can 
be targeted to cellular locales such as mitochondria, lysosomes, endoplasmic reticula, 
chloroplasts, Golgi, periplasm, the nucleus, and the plasma membrane. This method for targeting 
can also be used in the methods for generating peptide libraries and methods of screening 
such libraries where it is desired to identify molecules that interact or exhibit an activity at 
predetermined cellular locations. 

Examples 

Preparation of Cyclic Dihydrofolate Reductase (DHFR) and Cyclic Pseudostellarin F 
Materials and Methods 

Vector Construction 

The gene for the Ssp DnaE N-intein (I_N) was amplified from Ssp 6803 genomic 
DNA with Taq polymerase and primers introducing 5'-BglII and NsiI and 3'-PstI restriction 
sites. The Ssp DnaE I_C gene was amplified similarly with primers introducing 5'-NcoI and 
3'-NdeI and SacI restriction sites. Plasmid pDIMCP resulted from individually cloning the 
intein fragments into pDIMC7 [identical to pDIMC6 (see Ostermeier et al., Proc. Natl. 
Acad. Sci. USA 96:3562, 1999) except for conversion of a BamHI restriction site into 
BglII]. An alanine to histidine mutation in the I_C gene (A35H) was effected by 
Quick-Change mutagenesis (Stratagene), resulting in pDIMCPAH. Excision of the intein fragments 
as an NcoI/PstI digest and ligation into pAR4 [derived from pAR3 (Perez-Perez, J. and J. 
Gutierrez, Gene, 158:141, 1995; American Type Culture Collection (ATCC) #87026) with 
a unique NcoI in the multiple cloning site] generated pARCP (a in Fig. 3) and pARCPAH. 

E. coli DHFR was amplified from pET22b-DHFR (Miller, G.P. and S.J. Benkovic, 
Biochemistry 37:6327, 1998) with primers introducing a 5'-NdeI site followed by (CAC)6 
(encoding six histidine residues) and a 3'-PstI site, digested with NdeI/PstI and ligated into 
NdeI/NsiI-digested pARCP or pARCPAH to produce pARCP-DHFR (b in Fig. 3) and 
pARCPAH-DHFR (c in Fig. 3). A polyhistidine sequence was prepared synthetically with 
NdeI, NsiI and BspMI sites, and ligated into pARCPAH to produce plasmid pARCP2-6H, 
which encodes cyclo-[CHMHHHHHHGAGAA]. Plasmid pARCP-p was produced in 
three steps from pDIMCPAH: (1) Quick-Change mutagenesis introduced an AflII site into 
I_N, generating pDIMCPMA; (2) the pseudostellarin F gene was synthetically prepared and 
ligated into MfeI/AflII-digested pDIMCPMA to produce pDIMCP-p; and (3) the fusion 
construct was excised from pDIMCP-p as an NcoI/PstI fragment and ligated into NcoI/PstI-
digested pAR4 to produce pARCP-p (e in Fig. 3). To produce plasmid pARCBD-p, a 
KpnI site was introduced at the carboxyl terminus of the I_N gene of pARCP-p by 
Quick-Change mutagenesis to produce pARCPpK. The gene encoding the chitin binding domain 
was amplified from plasmid pCYB1 (New England Biolabs, Inc., Beverly, Massachusetts) 
with primers introducing a 5' KpnI site and a 3' HindIII site. Both the PCR product and 
pARCPpK were digested with KpnI and HindIII and ligated together to generate 
pARCBD-p (f in Fig. 3). All enzymes were from Promega or New England Biolabs unless 
otherwise noted. 
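
The statement that the (CAC)6 repeat encodes six histidine residues can be checked with a short translation sketch. This is only an illustration: the codon table below is a hand-picked subset of the standard genetic code, the function name is hypothetical, and the assumption that the NdeI recognition sequence (CATATG) is read in frame from its first base is ours, suggested by the HMHHHHHH stretch of cyclo-[CHMHHHHHHGAGAA] rather than stated in the text.

```python
# Subset of the standard genetic code; only the codons needed for this sketch.
CODON_TABLE = {"CAC": "H", "CAT": "H", "ATG": "M"}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string codon by codon (hypothetical helper)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (CAC)6 as introduced by the DHFR forward primer.
his_tag_dna = "CAC" * 6
his_tag = translate(his_tag_dna)

# Assumption: the NdeI site (CATATG) is read in frame from its first base.
ndei_plus_tag = translate("CATATG" + his_tag_dna)

print(his_tag)
print(ndei_plus_tag)
```

Under these assumptions the repeat translates to six histidines, and the NdeI site contributes a His-Met dipeptide immediately upstream of the tag.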



DHFR purification 
XL1-Blue cells harboring either pARCP-DHFR or pARCPAH-DHFR were 
grown in LB medium plus 50 µg/ml chloramphenicol at 37°C until the culture reached an 
OD600 of 0.7. The culture was induced with L-(+)-arabinose to a final concentration of 
0.5% and grown at 28°C for 24 hours. Cells were harvested by centrifugation (7,000 x g, 
10 minutes) and frozen in liquid nitrogen. The cells were lysed, and DHFR-containing 
proteins were purified as described (Miller and Benkovic, id). The cyclic product was 
separated from other DHFR-containing intermediates by FPLC using a Mono-Q column 
(Amersham Pharmacia) eluted with a gradient of 0-1 M NaCl in 50 mM Tris-HCl over 30 
minutes. Western blotting was performed with anti-His (Qiagen) and goat anti-mouse 
alkaline phosphatase-conjugated antibodies (Pierce) according to the manufacturers' 
instructions. 



Endoproteinase Lys-C Digestion 
Wild-type or cyclic DHFR (50 µg) was treated with 0.5 µg of endoproteinase Lys-C 
in 0.1 M NH4HCO3 at 37°C. Samples were taken at 6 and 24 hours, visualized on an 
SDS/16% PAGE gel and submitted for matrix-assisted laser desorption ionization 
(MALDI) time-of-flight mass spectrometry (Moore, W. T., Methods Enzymol., 289:520, 
1997). 



DHFR assays 

Thermostability was assayed by preincubation of 100 nM wild type or cyclic 
DHFR at either 25°C or 65°C in MTEN buffer [50 mM 2-(N-morpholino)ethanesulfonic 
acid (MES), 25 mM tris(hydroxymethyl)aminomethane (Tris), 25 mM ethanolamine and 
100 mM NaCl]. Aliquots were taken at various time points and equilibrated to room 
temperature for five minutes in the presence of 100 µM 7,8-dihydrofolate. Activity assays 
were initiated with reduced nicotinamide adenine dinucleotide phosphate (NADPH) as previously 
described (Miller and Benkovic, supra). 



Synthesis of cyclo-[Ser-Gly-Gly-Tyr-Leu-Pro-Pro-Leu] 

To a solution of 3.5 mg (4 µmol) of NH2-Ser-Gly-Gly-Tyr-Leu-Pro-Pro-Leu-CO2H 
and 1.8 mg (16 µmol) of N-hydroxysuccinimide in 20 ml of dimethylformamide was 
added 3.0 mg (16 µmol) of 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC). The 
reaction was stirred for 10 hours at 25°C. An additional 3.0 mg of EDC was then added 
and stirring was continued at 25°C for another 10 hours. The solvent was removed by 
rotary evaporation and the residue was dissolved in 2 ml of water for purification by 
reversed-phase HPLC on a Whatman Partisil 10 ODS-3 9.4-mm x 50-cm column eluted 
with a linear gradient of 0-50% (vol/vol) acetonitrile in 0.1% trifluoroacetic acid/water over 
30 minutes. The appropriate fractions were lyophilized to yield 2.8 mg (80%) of a white 
solid [m/z 785 (MH+)]. 1H-NMR and UV-visible spectra of the synthetically prepared 
material were consistent with published spectra for the isolated natural product (Morita, H., 
et al., Tetrahedron 50:9975, 1994). 



Pseudostellarin F Purification 
E. coli strains XL1-Blue, DH5α or BL21-DE3 harboring pARCP-p were grown 
and harvested as described for DHFR purification. The media (500 ml) was extracted 
three times with 1-butanol (3 x 100 ml). The extracts were combined and evaporated, and 
the solid residue was resuspended in 2 ml 0.1 M K2HPO4 (pH 8.0; lysis buffer). Cells were 
resuspended in 10 ml of lysis buffer, sonicated, and clarified by centrifugation (20,000 x g, 
20 minutes). The lysate was extracted (3 x 5 ml of n-butanol), and extracts were 
combined, evaporated and resuspended in 500 µl of lysis buffer. The recombinant product 
was purified from lysate and media extracts by HPLC as described above. Lyophilization 
of the appropriate fractions from the lysate and media extractions yielded an oily residue, 
m/z 785.47 (MH+), 807.43 (MNa+) and 823.44 (MK+). 1H-NMR and UV-visible 
spectra of the recombinant material were consistent with published spectra for the isolated 
natural product (Morita et al., supra). Proteins fused to the chitin binding domain were 
prepared as described above through generation of the clarified lysate. The lysate was 
passed over a chitin column (New England Biolabs, Inc.) equilibrated with lysis buffer. The 
column was eluted isocratically, and fractions containing splicing intermediates were pooled 
and submitted for MALDI mass spectral analysis. 



Tyrosinase Cloning 
The tyrosinase gene (including ORF 438) from Streptomyces antibioticus 
(Bernan et al., Gene 37:101, 1985) was amplified with Vent polymerase from pIJ702 
(ATCC no. 35287) with primers introducing 5' NdeI and 3' EcoRI restriction sites. The 
PCR product was digested with NdeI and EcoRI and ligated into similarly digested 
pDIMN2 (Ostermeier et al., supra) to generate pDIMN-Y. Transformed ligation mixtures 
were grown at ambient temperature for 5 days, and colonies that expressed tyrosinase 
were identified by pigment formation on FeCuY plates [LB agar plates containing ampicillin 
(200 µg/ml), FeCl3·6H2O (0.2 mM), CuSO4·5H2O (0.2 mM), L-tyrosine (0.3 mg/ml, Y) 
and isopropyl-β-D-galactoside (1 mM)] (Della-Cioppa, G. et al., Biotechnology 8:634, 
1990). 

Results 

Design of Genetic Constructs 
The genes encoding Ssp I_C and I_N were amplified from Ssp genomic DNA by 
standard molecular biology methods (Sambrook et al., supra) and serially ligated into 
pDIMC7. The resulting cyclization precursor (CP) fragment was excised from pDIMC7 
and cloned adjacent to the AraB promoter of pAR3 to generate the pARCP vector series 
(Fig. 3). These vectors activate the expression of cyclization precursors in the presence of 
arabinose. The E. coli DHFR gene was cloned between the NdeI and NsiI sites of 
pARCP to create an in-frame fusion with each of the split intein genes (b and c in Fig. 3). 
The PCR primer used to amplify DHFR also introduced a sequence encoding a 
six-histidine tag at the 5' end of the DHFR gene to allow immunodetection of the region to be 
cyclized. Two DHFR constructs were assembled in order to investigate the role of the 
penultimate residue of I_C in acid/base catalysis of asparagine side chain cyclization. Plasmid 
pARCP-DHFR (b in Fig. 3) encodes wild type I_C, which has an alanine residue 
neighboring the terminal asparagine. Plasmid pARCPAH-DHFR (c in Fig. 3) incorporates 
an alanine-to-histidine mutation at the penultimate position in the I_C gene. To produce 
pseudostellarin F (cyclo-[SGGYLPPL]), the vector was modified by silent mutation to 
create an AflII site at the 5'-end of the I_N gene (d in Fig. 3). An MfeI site occurs naturally 
at the 3'-end of the wild-type I_C gene. Ligation of a synthetically prepared, 
double-stranded insert encoding pseudostellarin F into the modified vector produced plasmid 
pARCP-p (e in Fig. 3). A KpnI site was introduced at the 3'-end of the I_N gene in order to 
fuse the gene for the chitin-binding domain to the pseudostellarin-producing construct (f in 
Fig. 3). 



Production and Characterization of Cyclic DHFR 
DHFR cyclization was readily apparent by SDS-PAGE upon arabinose induction 
of pARCP-DHFR as shown in FIG. 4 (F: I_C-DHFR-I_N fusion protein. T: I_C-DHFR-I_N 
fusion thioester intermediate. R: I_C-DHFR lariat intermediate. L: linear DHFR. O: cyclic 
DHFR. I_N: N-intein. I_C: C-intein. Lane 1: uninduced XL1-Blue/pARCP-DHFR. Lane 2: 
arabinose-induced XL1-Blue/pARCP-DHFR. Lane 3: arabinose-induced 
XL1-Blue/pARCPAH-DHFR. Lane 4: lane 3 crude lysate after methotrexate agarose. Lane 5: 
lane 4 material post FPLC. Lane 6: wild-type DHFR). 

Bands with apparent molecular weights corresponding to the linear (L, 23 kDa) and 
cyclic (O, 21 kDa) DHFR products, the fusion protein (F, 37 kDa), I_N (14 kDa) and I_C (4 
kDa) were clearly visible, as were bands tentatively assigned as the thioester (T, 36 kDa) 
and lariat intermediates (R, 26 kDa) (Fig. 4, lane 2). Mutation of the penultimate residue of 
I_C (A35) from alanine to histidine (c in Fig. 3) improved the yield of cyclic DHFR (Fig. 4, 
lane 3). Methotrexate-agarose affinity chromatography of the crude lysate (Fig. 4, lane 4) 
confirmed that the majority of the induced bands contained correctly folded DHFR. 
Although I_N is not covalently attached to DHFR, it was retained on the methotrexate 
column, presumably due to non-covalent complex formation with the I_C-DHFR lariat 
intermediate (R). The methotrexate-agarose eluate was fractionated by FPLC, allowing 
purification of 5 mg of the cyclic product per liter of culture (Fig. 4, lane 5). Western 
blotting (not shown) with an anti-His antibody demonstrated the presence of the 
polyhistidine linker sequence (d in Fig. 3) in the FPLC-purified protein. The protein 
migrated more rapidly in SDS/PAGE analyses than recombinant DHFR (Fig. 4, lane 6) 
despite the extra 11-amino acid linker sequence (b in Fig. 3), implying an additional 
topological constraint. Furthermore, no reaction was detected when the FPLC-purified 
protein was reacted with phenylisothiocyanate (Edman, P., Acta Chem. Scand., 4:283, 
1950), suggesting that the amino terminus was unavailable. 

Cyclic DHFR had steady-state kinetic parameters and substrate, cofactor and 
methotrexate dissociation constants which were indistinguishable from the wild type enzyme 
at 25°C. Activity assays conducted after 65°C preincubation of wild type and cyclic 
DHFR indicated that cyclization improved the thermostability of the enzyme (Fig. 5). 
Endoproteinase Lys-C digestion was used to demonstrate unambiguously that the 
FPLC-purified protein was cyclic DHFR. Digestion of the wild type enzyme produces 
amino-terminal (4.4 kDa) and carboxy-terminal (6.3 kDa) fragments; in a cyclic protein, these two 
fragments would be joined, resulting in a 10.7 kDa digestion product (Fig. 6). The 
FPLC-purified material was resistant to proteolysis compared to the wild type enzyme, and mass 
spectral analysis of the digestion mixtures identified a 10.7 kDa peak in the product 
resulting from the cyclic protein, which was absent in the wild type enzyme (data not 
shown). 
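
The logic of the Lys-C experiment (the terminal fragments of a linear protein appear as one joined fragment when the termini are linked) can be illustrated with a small sketch. The sequence below is a made-up toy string, not DHFR, and the function name is hypothetical; the sketch only assumes that Lys-C cleaves on the carboxyl side of every lysine.

```python
def lysc_fragments(seq: str, circular: bool = False) -> list[str]:
    """Cut after every lysine (K); optionally treat seq as a closed ring."""
    pieces, start = [], 0
    for i, residue in enumerate(seq):
        if residue == "K":
            pieces.append(seq[start:i + 1])
            start = i + 1
    tail = seq[start:]  # residues after the last lysine
    if not circular:
        return pieces + ([tail] if tail else [])
    if not pieces:
        return [seq]  # no lysine: the ring is never opened
    # In a ring, the C-terminal tail is still bonded to the N-terminal piece.
    pieces[0] = tail + pieces[0]
    return pieces


toy = "MISLKAALKVDRW"  # hypothetical sequence for illustration only
linear = lysc_fragments(toy)
cyclic = lysc_fragments(toy, circular=True)
print(linear)
print(cyclic)
```

In the toy case the separate amino-terminal and carboxy-terminal fragments of the linear digest are replaced by a single joined fragment in the cyclic digest, mirroring the 4.4 kDa + 6.3 kDa = 10.7 kDa reasoning above.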

Production and Characterization of Pseudostellarin F 
Pseudostellarin F production was readily detected in vivo through inhibition of 
recombinant Streptomyces antibioticus tyrosinase (FIG. 7). In the experiment shown in 
FIG. 7, XLI-Blue cells were co-transformed with pDEM-NY and either pARCP2-6H (a & 
b) or pARCP-p ( c & d). The cells were plated on FeCuY plates with chloramphenicol 
(50 ug/ml) 5 either without (a & c) or with (b & d) L-(+)arabinose (0.5%). 

Co-expression of pseudostellarin F in tyrosinase expressing cells dramatically 
reduced pigment formation (d in Fig. 7). Expression of an unrelated cyclic peptide from 
pARCP2-6H failed to inhibit tyrosinase (a and b in Fig. 7), and inhibition absolutely 
required arabinose induction (compare c and d in Fig. 7). SDS/PAGE analysis of 
arabinose-induced pARCP-p in several bacterial strains (BL21-DE3, DH5α, and XL1-Blue) 
allowed the visualization of bands corresponding to the fusion protein (F), thioester 
intermediate (T) and I_N. An intense, low molecular weight band was also visible, but the 
resolution was insufficient to separate the lariat intermediate (R) and I_C (data not shown). 
Although pseudostellarin F was too small to be visualized by SDS/PAGE, mass spectral 
analysis indicated its presence in both the crude cell lysate and media. Approximately 30 
µg of the recombinant cyclic peptide was isolated from the cell lysate per gram of wet cell 
mass. Pseudostellarin F was also isolated from the media by 1-butanol extraction followed 
by HPLC with a yield that varied between 2 mg/liter (XL1-Blue) and 20 mg/liter (BL21-DE3) 
depending on the expression strain. The NMR spectrum of the recombinant material 
was consistent with that reported for the natural product (Morita et al., supra), and the 
retention time of the bacterially expressed cyclic peptide was identical to that of a synthetically 
prepared standard. The recombinant material failed to react with ninhydrin (see, Gordon, 
A. J., and R. A. Ford, The Chemist's Companion: A Handbook of Practical Data, 
Techniques, and References, Wiley Interscience, New York, 1972), indicating a backbone 
cyclic peptide (lactam) rather than a lactone product. Neither HPLC nor mass spectral 
analysis provided any evidence for production of the linear parent peptide. 

A chitin-binding domain was fused to the carboxy-terminus of I_N to affinity-purify 
intermediates of the intein-mediated ligation reaction and characterize them by MALDI 
mass spectrometry (Table 1). 

                                   Mass, Da
Reaction Component      Linear       Cyclic       Observed
F, T                    24,380.5     NA           24,380.4
I_N                     19,642.0     NA           19,642.3
R                        4,756.5     4,738.5       4,756.2
I_C                      3,969.2     3,951.2       3,953.0
Pseudostellarin F          802.4       784.4         784.4

Table 1: Mass spectral characterization of pseudostellarin F cyclization intermediates 

All of the intermediates of the splicing reaction, including I_C, were retained when the crude 
cell lysate from arabinose-induced pARCBD-p in XL1-Blue was passed over a chitin 
affinity column. Pseudostellarin F was recovered from the unretained material by 1-butanol 
extraction. The observed molecular masses for the fusion protein (F), the thioester 
intermediate (T) and I_N were in excellent agreement with the values predicted from the gene 
sequence. The mass of I_C was consistent with the asparagine-cyclized form as predicted 
from the proposed mechanism of product release. The molecular mass of the lariat 
intermediate (R) was more consistent with the linear I_C-pseudostellarin F fusion product 
than the branched lactone product expected from the transesterification reaction. 
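
The assignments above can be checked arithmetically against Table 1. The following is a minimal Python sketch, assuming that each cyclic mass equals the linear mass less 18.0 Da (the difference seen in every row of Table 1 that lists both values) and that the observed mass is assigned to whichever predicted form it lies closer to. The dictionary, function name, and decision rule are illustrative and do not appear in the specification.

```python
# Assumption: cyclization removes one water, matching the 18.0 Da gap in Table 1.
WATER_DA = 18.0

# (predicted linear mass, observed mass) in Da, copied from Table 1.
table1 = {
    "R": (4756.5, 4756.2),
    "I_C": (3969.2, 3953.0),
    "Pseudostellarin F": (802.4, 784.4),
}


def closer_form(linear_da: float, observed_da: float) -> str:
    """Return which predicted form ('linear' or 'cyclic') the observed mass is nearer to."""
    cyclic_da = linear_da - WATER_DA
    if abs(observed_da - cyclic_da) < abs(observed_da - linear_da):
        return "cyclic"
    return "linear"


assignments = {name: closer_form(lin, obs) for name, (lin, obs) in table1.items()}
for name, form in assignments.items():
    print(f"{name}: observed mass is closer to the {form} prediction")
```

Under these assumptions the lariat intermediate (R) falls nearer the linear prediction, while I_C and pseudostellarin F fall nearer the cyclic predictions.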

Other Embodiments 

While the above specification contains many specifics, these should not be construed 
as limitations on the scope of the invention, but rather as examples of preferred embodiments 
thereof. Many other variations are possible. Accordingly, the scope of the invention should 
be determined not by the embodiments illustrated, but by the appended claims and their legal 
equivalents. 

1. A non-naturally occurring nucleic acid molecule encoding a polypeptide 
comprising a first portion of a split intein, a second portion of a split intein, and a target 
peptide interposed between the first portion of a split intein and the second portion of a 
split intein; 
wherein expression of the nucleic acid molecule in a host system produces the 
polypeptide in a form selected from the group consisting of: (a) a polypeptide that 
spontaneously splices in the host system to yield a cyclized form of the target peptide, and 
(b) a splicing intermediate of a cyclized form of the target peptide. 

2. The non-naturally occurring nucleic acid molecule of claim 1, wherein the 
polypeptide is a polypeptide that spontaneously splices in the host system to yield a cyclized 
form of the target peptide. 

3. The non-naturally occurring nucleic acid molecule of claim 1, wherein the 
polypeptide is a splicing intermediate of a cyclized form of the target peptide. 

4. The non-naturally occurring nucleic acid molecule of claim 1, wherein 
both the first portion of a split intein and the second portion of a split intein are derived 
from a naturally-occurring split intein. 

5. The non-naturally occurring nucleic acid molecule of claim 4, wherein 
both the first portion of a split intein and the second portion of a split intein are derived from 
Ssp DnaE. 

6. The non-naturally occurring nucleic acid molecule of claim 1, wherein 
at least one of the first portion of a split intein and the second portion of a split intein is 
derived from a non-naturally occurring split intein. 

7. The non-naturally occurring nucleic acid molecule of claim 6, wherein the 
non-naturally occurring split intein is derived from the group consisting of RecA, DnaB, Psp 
Pol-I, and Pfu inteins. 

8. The non-naturally occurring nucleic acid molecule of claim 1, wherein 
both the first portion of a split intein and the second portion of a split intein are derived from 
a non-naturally occurring split intein. 

9. The non-naturally occurring nucleic acid molecule of claim 3, wherein the 
splicing intermediate is an active intein intermediate. 

10. The non-naturally occurring nucleic acid molecule of claim 3, wherein the 
splicing intermediate is a thioester intermediate. 

11. The non-naturally occurring nucleic acid molecule of claim 3, wherein the 
splicing intermediate is a lariat intermediate. 

12. A non-naturally occurring nucleic acid molecule encoding a polypeptide 
comprising a first portion of a split intein, a second portion of a split intein, a third portion 
of a split intein, and a fourth portion of a split intein, wherein a first target peptide is 
interposed between the first portion of a split intein and the second portion of a split intein, 
and a second target peptide is interposed between the third portion of a split intein and the 
fourth portion of a split intein. 

13. The non-naturally occurring nucleic acid molecule of claim 12, wherein the 
first portion of a split intein is complementary to the third portion of a split intein but not 
complementary to the second portion of a split intein, and the second portion of a split intein 
is complementary to the fourth portion of a split intein but not complementary to the third 
portion of a split intein. 

14. An expression vector comprising a nucleic acid molecule that encodes a 
polypeptide comprising a first portion of a split intein, a second portion of a split intein, and 
a target peptide interposed between the first portion of a split intein and the second portion 
of a split intein, wherein expression of the nucleic acid molecule in a host system produces 
the polypeptide in a form selected from the group consisting of: (a) a polypeptide that 
spontaneously splices in the host system to yield a cyclized form of the target peptide, and 
(b) a splicing intermediate of a cyclized form of the target peptide. 

15. The expression vector of claim 14, wherein the polypeptide is a 
polypeptide that spontaneously splices in the host system to yield a cyclized form of the 
target peptide. 

16. The expression vector of claim 14, wherein the polypeptide is a splicing 
intermediate of a cyclized form of the target peptide. 

17. The expression vector of claim 15, wherein the nucleic acid molecule 
further comprises a regulatory sequence that facilitates expression of the polypeptide in the 
host system. 

18. The expression vector of claim 14, wherein the nucleic acid molecule 
further comprises a nucleotide sequence that encodes a peptide that facilitates screening of 
the cyclized form of the target peptide for a particular characteristic. 

19. The expression vector of claim 14, wherein the nucleic acid molecule 
further comprises a nucleotide sequence that encodes a peptide that facilitates purifying the 
cyclized form of the target peptide from the host system. 

20. The expression vector of claim 14, wherein the target peptide has a first 
end fused to the first portion of a split intein and a second end fused to the second portion 
of a split intein. 

21. The expression vector of claim 14, wherein both the first portion of a split 
intein and the second portion of a split intein are derived from a naturally-occurring split 
intein. 

22. The expression vector of claim 21, wherein both the first portion of a split 
intein and the second portion of a split intein are derived from Ssp DnaE. 

23. The expression vector of claim 14, wherein at least one of the first portion 
of a split intein and the second portion of a split intein is derived from a non-naturally 
occurring split intein. 

24. The expression vector of claim 23, wherein the non-naturally occurring split 
intein is derived from the group consisting of RecA, DnaB, Psp Pol-I, and Pfu inteins. 

25. The expression vector of claim 14, wherein both the first portion of a split 
intein and the second portion of a split intein are derived from a non-naturally occurring split 
intein. 

26. The expression vector of claim 16, wherein the splicing intermediate is an 
active intein intermediate. 

27. The expression vector of claim 16, wherein the splicing intermediate is a 
thioester intermediate. 

28. The expression vector of claim 16, wherein the splicing intermediate is a 
lariat intermediate. 

29. The expression vector of claim 14, wherein the host system comprises a 
prokaryotic cell. 

30. The expression vector of claim 29, wherein the prokaryotic cell is a 
bacterium. 

31. The expression vector of claim 30, wherein the bacterium is Escherichia 
coli. 

32. The expression vector of claim 14, wherein the host system comprises a 
eukaryotic cell. 

33. The expression vector of claim 32, wherein the eukaryotic cell is a yeast. 

34. The expression vector of claim 33, wherein the eukaryotic cell is a 
mammalian cell. 

35. The expression vector of claim 14, wherein the host system comprises an 
archaebacterium. 

36. The expression vector of claim 14, wherein the host system comprises a 
plant cell. 

37. The expression vector of claim 14, wherein the vector is a plasmid. 

38. The expression vector of claim 14, wherein the vector is a bacteriophage. 

39. The expression vector of claim 14, wherein the vector is a virus. 

40. The expression vector of claim 14, wherein the vector is a linear nucleic 
acid molecule. 

41. A substantially pure polypeptide comprising a first portion of a split intein, 
a second portion of a split intein, and a target peptide interposed between the first portion 
of a split intein and the second portion of a split intein, wherein the polypeptide is selected 
from the group consisting of: (a) a polypeptide that spontaneously splices in the host system 
to yield a cyclized form of the target peptide, and (b) a splicing intermediate of a cyclized 
form of the target peptide. 

42. The polypeptide of claim 41, wherein the polypeptide is a polypeptide that 
spontaneously splices in the host system to yield a cyclized form of the target peptide. 

43. The polypeptide of claim 41, wherein the polypeptide is a splicing 
intermediate of a cyclized form of the target peptide. 

44. The polypeptide of claim 41, wherein the target peptide has a first end 
fused to the first portion of a split intein and a second end fused to the second portion of a 
split intein. 

45. The polypeptide of claim 41, wherein both the first portion of a split intein 
and the second portion of a split intein are derived from a naturally-occurring split intein. 

46. The polypeptide of claim 45, wherein both the first portion of a split intein 
and the second portion of a split intein are derived from Ssp DnaE. 

47. The polypeptide of claim 41, wherein at least one of the first portion of a 
split intein and the second portion of a split intein is derived from a non-naturally occurring 
split intein. 

48. The polypeptide of claim 47, wherein the non-naturally occurring split intein 
is derived from the group consisting of RecA, DnaB, Psp Pol-I, and Pfu inteins. 

49. The polypeptide of claim 41, wherein both the first portion of a split intein 
and the second portion of a split intein are derived from a non-naturally occurring split 
intein. 

50. The polypeptide of claim 43, wherein the splicing intermediate is an active 
intein intermediate. 

51. The polypeptide of claim 43, wherein the splicing intermediate is a thioester 
intermediate. 

52. The polypeptide of claim 43, wherein the splicing intermediate is a lariat 
intermediate. 

53. A host system comprising a non-naturally occurring nucleic acid molecule 
encoding a polypeptide comprising a first portion of a split intein, a second portion of a 
split intein, and a target peptide interposed between the first portion of a split intein and the 
second portion of a split intein; 
wherein expression of the nucleic acid molecule in the host system produces the 
polypeptide in a form selected from the group consisting of: (a) a polypeptide that 
spontaneously splices in the host system to yield a cyclized form of the target peptide, and 
(b) a splicing intermediate of a cyclized form of the target peptide. 

54. The host system of claim 53, wherein the polypeptide is a polypeptide that 
spontaneously splices in the host system to yield a cyclized form of the target peptide. 

55. The host system of claim 53, wherein the polypeptide is a splicing 
intermediate of a cyclized form of the target peptide. 

56. The host system of claim 53, wherein the host system comprises a 
prokaryote. 

57. The host system of claim 56, wherein the prokaryote is a bacterium. 

58. The host system of claim 53, wherein the host system comprises an 
archaebacterium. 

59. The host system of claim 53, wherein the host system comprises a 
eukaryote. 

60. The host system of claim 59, wherein the eukaryote is a yeast. 

61. The host system of claim 59, wherein the eukaryote is a mammalian cell. 

62. The host system of claim 53, wherein the host system comprises a plant 
cell. 

63. A method for making a peptide molecule, the method comprising the steps 
of: 
providing an isolated nucleic acid molecule that encodes a polypeptide comprising 
a first portion of a split intein, a second portion of a split intein, and a target peptide 
interposed between the first portion of a split intein and the second portion of a split intein, 
wherein expression of the nucleic acid molecule in a host system produces the peptide 
molecule in a form selected from the group consisting of: (a) a cyclized form of the target 
peptide resulting from spontaneous splicing of the polypeptide in the host system, and (b) 
a splicing intermediate of a cyclized form of the target peptide; 
providing the host system; 
introducing the isolated nucleic acid molecule into the host system; and 
expressing the isolated nucleic acid molecule. 

64. The method of claim 63, wherein the step of expressing the isolated nucleic 
acid molecule results in production of a polypeptide that spontaneously splices in the host 
system to yield the cyclized form of the target peptide. 

65. The method of claim 64 further comprising the step of purifying the cyclized 
form of the target peptide from the host system. 

66. The method of claim 63, wherein the step of expressing the isolated nucleic 
acid molecule results in production of a splicing intermediate of a cyclized form of the target 
peptide. 

67. The method of claim 66 further comprising the step of purifying the splicing 
intermediate of a cyclized form of the target peptide from the host system. 

68. The method of claim 66, wherein the splicing intermediate is an active 
intein intermediate. 

69. The method of claim 66, wherein the splicing intermediate is a thioester 
intermediate. 

70. The method of claim 66, wherein the splicing intermediate is a lariat 
intermediate. 

71. The method of claim 66, further comprising the step of forming the cyclic 
peptide from the splicing intermediate. 

72. The method of claim 63, wherein the isolated nucleic acid molecule is 
incorporated into an expression vector that facilitates expression of the isolated nucleic 
acid molecule in the host system. 

73. The method of claim 72, wherein the expression vector is a plasmid. 

74. The method of claim 72, wherein the expression vector is a bacteriophage. 

75. The method of claim 72, wherein the expression vector is a virus. 

76. The method of claim 63, wherein the host system comprises a prokaryotic 
cell. 

77. The method of claim 76, wherein the prokaryotic cell is a bacterium. 

78. The method of claim 77, wherein the bacterium is Escherichia coli. 

79. The host system of claim 63, wherein the host system comprises an 
archaebacterium. 

80. The method of claim 63, wherein the host system comprises a eukaryotic 
cell. 

81. The method of claim 80, wherein the eukaryotic cell is a yeast. 

82. The method of claim 80, wherein the eukaryotic cell is a mammalian cell. 

83. The method of claim 63, wherein the host system comprises a plant cell. 

84. The method of claim 63, wherein the host system comprises an in vitro 
transcription/translation system. 

85. The method of claim 84, wherein the in vitro transcription/translation system 
comprises a cell lysate. 

86. The method of claim 64, wherein the production of the target peptide in 
cyclized form occurs in the host system in the absence of an exogenously-added agent. 

87. The method of claim 86, wherein the exogenously-added agent is a 
protease. 

88. The method of claim 86, wherein the exogenously-added agent is a thiol. 

89. The method of claim 72, wherein the expression vector is inducible. 

90. A method of preparing a library of peptide molecules, the method 
comprising the steps of: 
providing a plurality of nucleic acid molecules encoding a plurality of target peptides 
having heterogenous amino acid sequences; 
incorporating each of the plurality of nucleic acid molecules into an expression 
vector to form a plurality of expression vectors, whereby each of the plurality of nucleic 
acid molecules is interposed between a nucleic acid molecule encoding a first portion of a 
split intein and a nucleic acid molecule encoding a second portion of a split intein in each of 
the formed expression vectors, wherein expression of the expression vectors in a host 
system results in the production of a plurality of peptide molecules in a form selected from 
the group consisting of: (a) polypeptides that spontaneously splice in the host system to 
yield cyclized forms of the target peptides, and (b) splicing intermediates of cyclized forms 
of the target peptides; and 
expressing the expression vectors in the host system. 

91. The method of claim 90, wherein the plurality of polypeptides are 
polypeptides that spontaneously splice in the host system to yield cyclized forms of the 
target peptides. 

92. The method of claim 90, wherein the plurality of polypeptides are splicing 
intermediates of cyclized forms of the target peptides. 

93. The method of claim 90, wherein the plurality of nucleic acid molecules 
encoding a plurality of target peptides are produced by solid phase synthesis. 

94. The method of claim 90, wherein the plurality of nucleic acid molecules 
encoding a plurality of target peptides are produced using polymerase chain reaction. 

95. The method of claim 90, wherein the plurality of nucleic acid molecules 
encoding a plurality of target peptides are produced by enzymatically digesting a larger 
nucleic acid molecule. 

96. The method of claim 95, wherein the larger nucleic acid molecule is 
derived from an organism. 

97. The method of claim 90, wherein the plurality of nucleic acid molecules 
encoding a plurality of target peptides are produced from a progenitor nucleic acid 
molecule that has been amplified under conditions which introduce mutations into the 
progenitor nucleic acid molecule's nucleotide sequence. 

98. A method of screening a peptide molecule for a predetermined 
characteristic, the method comprising the steps of: 
providing a nucleic acid molecule that encodes a polypeptide comprising a first 
portion of a split intein, a second portion of a split intein, and a target peptide interposed 
between the first portion of a split intein and the second portion of a split intein, wherein 
expression of the nucleic acid molecule in a host system produces the peptide molecule in a 
form selected from the group consisting of: (a) a cyclized form of the target peptide resulting 
from spontaneous splicing of the polypeptide in the host system, and (b) a splicing 
intermediate of a cyclized form of the target peptide; 
providing the host system; 
introducing the isolated nucleic acid molecule in the host system; 
placing the host system under conditions that cause the peptide molecule to be 
produced; and 
testing the peptide molecule for the predetermined characteristic. 

99. The method of claim 98, wherein the peptide molecule is a cyclized form of 
the target peptide. 

100. The method of claim 98, wherein the peptide molecule is a splicing 
intermediate of a cyclized form of the target peptide. 

101. The method of claim 98, wherein the predetermined characteristic 
comprises the ability to specifically bind a target molecule, and the step of testing the 
peptide molecule for the predetermined characteristic comprises the steps of (a) contacting 
the peptide molecule to the target molecule and (b) determining whether the peptide 
molecule binds to the target molecule. 

102. The method of claim 101, wherein the step of determining whether the 
peptide molecule binds to the target molecule is measured by observing a color change. 

103. The method of claim 101, wherein the step of determining whether the 
peptide molecule binds to the target molecule is measured by observing a fluorescent 
signal. 

104. The method of claim 101, wherein the step of determining whether the 
peptide molecule binds to the target molecule is measured by analyzing the cell cycle of an 
organism. 

105. The method of claim 101, wherein the step of determining whether the 
peptide molecule binds to the target molecule is measured by analyzing the reproduction 
of an organism. 

106. The method of claim 101, wherein the target molecule is a cell-associated 
molecule. 

107. The method of claim 106, wherein the cell-associated molecule is a 
membrane-associated molecule. 

108. The method of claim 106, wherein the cell-associated molecule is an 
intracellular molecule. 

109. The method of claim 108, wherein the intracellular molecule is a nuclear 
molecule. 

110. The method of claim 108, wherein the intracellular molecule is an organelle. 

111. The method of claim 110, wherein the organelle is selected from the group 
consisting of: mitochondria, lysosomes, endoplasmic reticula, chloroplasts, golgi, and 
periplasm. 

112. The method of claim 101, wherein the target molecule is an extracellular 
molecule. 



113. The method of claim 98, wherein the predetermined characteristic is the 
ability to modulate a biochemical reaction, and the step of testing the peptide molecule for 
the predetermined characteristic comprises the steps of (a) contacting the peptide molecule 
to a system containing the biochemical reaction and (b) determining whether the peptide 
molecule modulates the biochemical reaction. 

114. The method of claim 113, wherein the step of determining whether the 
peptide molecule modulates the biochemical reaction is measured by observing a color 
change. 

115. The method of claim 113, wherein the step of determining whether the 
peptide molecule modulates the biochemical reaction is measured by observing a 
fluorescent signal. 

116. The method of claim 113, wherein the step of determining whether the 
peptide molecule modulates the biochemical reaction is measured by analyzing the cell 
cycle of an organism. 

117. The method of claim 113, wherein the step of determining whether the 
peptide molecule modulates the biochemical reaction is measured by analyzing the 
reproduction of an organism. 

118. The method of claim 113, wherein the biochemical reaction is a 
cell-associated process. 

119. The method of claim 118, wherein the biochemical reaction 
is an intracellular metabolic event. 

120. The method of claim 118, wherein the biochemical reaction is a 
membrane-associated event. 

121. The method of claim 118, wherein the biochemical reaction is a nuclear 
event. 

122. The method of claim 113, wherein the biochemical reaction is an 
extracellular reaction. 

123. The method of claim 98, wherein the step of testing the peptide molecule 
for the predetermined characteristic is performed using a hybrid system. 

124. The method of claim 98, further comprising the step of immobilizing the 
peptide molecule on a solid phase support. 

125. A method for purifying a cyclic peptide from a mixture, the method 
comprising the steps of: 
providing a mixture containing a splicing intermediate conjugated with an affinity tag; 
mixing the conjugated splicing intermediate with a solid phase support having a 
ligand thereon that specifically binds the affinity tag whereby the support becomes 
specifically bound with the splicing intermediate; 
washing the support to remove non-specifically bound matter from the support; 
adding to the support a reagent that makes a cyclic peptide from the splicing 
intermediate; and 
eluting the cyclic peptide from the support. 

126. A method for purifying a cyclic peptide from a mixture, the method 
comprising the steps of: 
providing a mixture containing a splicing intermediate conjugated with an affinity tag; 
mixing the conjugated splicing intermediate with a solid phase support having a 






WO 00/36093 PCT/US99/30162 




ligand thereon that specifically binds the affinity tag whereby the support becomes 
specifically bound with the splicing intermediate; 
washing the support to remove non-specifically bound matter from the support; 
eluting the splicing intermediate from the support; and 
adding a reagent to the eluted splicing intermediate that makes a cyclic peptide from 
the splicing intermediate. 

127. A method for purifying a target molecule that binds a splicing intermediate 
from a mixture, the method comprising the steps of: 
providing a solid phase support having the splicing intermediate specifically bound 
thereon; 
contacting the support with the target molecule in the mixture; 
washing the support to remove non-specifically bound matter from the support; 
and 
eluting the target molecule from the support. 